Journal of Mathematics and Physics

(Founded by C. L. E. Moore)


Volume  XIII 

1934 


Editors 


VANNEVAR  BUSH 
FREDERICK  G.  KEYES 
DIRK  J.  STRUIK 


MANUEL  S.  VALLARTA 
NORBERT  WIENER 
FREDERICK  S.  WOODS 


Managing  Editor 
PHILIP  FRANKLIN 


Published  by  the 

MASSACHUSETTS  INSTITUTE  OF  TECHNOLOGY  PRESS 


CONTENTS 


On New Properties of the Arithmetical Means of the Partial Sums of Fourier Series. By Leopold Fejér . 1

A General Survey of the Theory of Adiabatic Invariants. By Tullio Levi-Civita . 18

A Formal Theorem on the Derivatives of a Series of Zonal Harmonics. By Robert F. H. Chao . 41

On the Rational Solutions of the Matrix Equation P(X) = A. By M. H. Ingraham . 46

On Causality, Statistics and Probability. By Eberhard Hopf . 51

Non-Riemannian Dynamics of Rotating Electrical Machinery. By Gabriel Kron . 103

The Stress Distribution of Longitudinal Welds and Adjoining Structures. By William Hovgaard . 105

Formulae  Giving  the  Change  in  Green’s  Function  and  in  the  Conjugate 

Function.  By  J.  O.  E»te» .  249 

Regions of Positive and Negative Curvature on Closed Surfaces. By Philip Franklin . 253

Effect of Surface Discontinuity on the Distribution of Potential. By H. B. Phillips . 261

A Modification of Levi-Civita's Wave Equation. By Banesh Hoffmann . 268

A General Theory of Electric Wave Filters. By H. W. Bode . 275

A Six Color Problem. By Philip Franklin . 363

Three Mathematical Methods of Analysing Polarised Light. By Dorothy W. Weeks . 371

A Study of Sixteen Coherency Matrices. By Dorothy W. Weeks . 380

The Criterion for a Stationary Point of One of a Set of Implicit Functions. By Prescott D. Crout . 387

The  Theory  of  Linear  Matrix  Transformations  with  Applications  to  the 

Theory  of  Linear  Matrix  Equations.  By  Lawrence  Harrie . 399 

Stress Functions. By H. B. Phillips . 421

Kn  IN  Aan-i  Possessing  an  5f-Tuply  Orthogonal  Net  Along  the  Lines  of 

Curvature.  By  N.  Kaplan .  426 

The Indeterminate and Composite Product of Matrices. By G. W. King . 433




ON NEW PROPERTIES OF THE ARITHMETICAL MEANS
OF THE PARTIAL SUMS OF FOURIER SERIES¹

By Leopold Fejér²


1.  The  properties  of  the  partial  sums  and  of  the  arithmetical  means  of 
Fourier  series  of  a  function  of  a  real  variable  or  of  a  power  series  of  a 
function  of  a  complex  variable,  have  been  studied  with  much  success 
during  recent  decades.  These  investigations  are  not  confined  to 
questions  of  convergence  relative  to  the  infinite  series  of  the  partial 
sums  or  their  means,  but  include  also  the  study  of  the  individual 
approximating  curves  as  compared  with  that  of  the  function.  In  this 
lecture  I  shall  consider  some  new  results  of  the  latter  type. 

2. I shall begin with a simple illustration. Let us consider the
straight line segment joining the points $(0, \pi/2)$ and $(\pi, 0)$. This
line segment defines the linear function

(1)  $f(x) = \dfrac{\pi - x}{2}$,

which, when developed in a Fourier sine series on the interval $(0, \pi)$,
takes the following form:

(2)  $f(x) = \dfrac{\sin x}{1} + \dfrac{\sin 2x}{2} + \cdots + \dfrac{\sin nx}{n} + \cdots, \qquad 0 < x \le \pi.$

¹ Lecture delivered in May, 1933 at several universities in the Eastern and
Middlewestern States. For the preparation of the English text of this
lecture I am indebted to Professor and Mrs. I. A. Barnett.

² Professor of mathematics at the University in Budapest, Hungary. Ed.

3. The partial sum of index n of this sine series is

(3)  $S_n^{(0)}(x) = s_n(x) = \dfrac{\sin x}{1} + \dfrac{\sin 2x}{2} + \cdots + \dfrac{\sin nx}{n}.$

As we all know, these partial sums have been very thoroughly investigated
and their properties form the basis of the theory of the classical
Gibbs' Phenomenon. I can therefore be very brief here and simply
refer to the familiar works of Gibbs, Bôcher and Runge.

However, there is one property of the partial sums (3) of which I
shall speak because of the important rôle it plays in what is to follow.
The partial sum

$\dfrac{\sin x}{1} + \dfrac{\sin 2x}{2} + \cdots + \dfrac{\sin nx}{n}$

is positive for every value on the interval $0 < x < \pi$, no matter what
positive integral value n may have. This result I announced without
proof as early as 1910. Shortly after that Dunham Jackson and T. H.
Gronwall published proofs of this theorem. Later I myself gave a third
proof and E. Landau presented a fourth proof of this elementary fact.
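The positivity statement above can be spot-checked numerically. The following is a minimal sketch, not a proof: it samples the partial sum on a finite grid inside the open interval, and the function names and grid size are illustrative choices that do not come from the lecture.

```python
import math


def partial_sum(n: int, x: float) -> float:
    """s_n(x) = sin(x)/1 + sin(2x)/2 + ... + sin(nx)/n, the partial sum (3)."""
    return sum(math.sin(k * x) / k for k in range(1, n + 1))


def min_on_grid(n: int, points: int = 2000) -> float:
    """Smallest sampled value of s_n on an equally spaced grid inside (0, pi)."""
    return min(
        partial_sum(n, math.pi * j / (points + 1)) for j in range(1, points + 1)
    )


if __name__ == "__main__":
    # Every sampled minimum should be strictly positive.
    for n in (1, 2, 5, 10, 25, 50):
        print(n, min_on_grid(n))
```

A grid check of this kind can only fail to find a counterexample; the proofs cited in the text are what establish the result for every n.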

4. I now turn to the arithmetical mean of the first order

(4)  $S_n^{(1)}(x) = \dfrac{s_0(x) + s_1(x) + \cdots + s_n(x)}{n + 1}$

of the sine series (2), where $s_0(x) = 0$. Since the curves $y = s_n(x)$,
$(n = 1, 2, 3, \ldots)$, all lie above the X-axis for $0 < x < \pi$, it is
clear that the mean curves

(5)  $y = S_n^{(1)}(x), \qquad (n = 1, 2, 3, \ldots),$

also lie above the X-axis on the interval $0 < x < \pi$.

On the other hand it follows readily from a general theorem on the
arithmetical means of the first order of Fourier series that

(6)  $S_n^{(1)}(x) < \dfrac{\pi}{2}$

on the interval $0 < x < \pi$, so that all the curves $y = S_n^{(1)}(x)$ lie
completely below the straight line segment $y = \pi/2$, $0 < x < \pi$. In this
connection we recall that the arithmetical means of the first order
no longer exhibit the Gibbs' phenomenon.

Furthermore, as far back as 1912, I demonstrated that the mean
curves of the first order

$y = S_n^{(1)}(x)$

for $0 < x < \pi$, lie completely even below the straight line

(7)  $y = \dfrac{\pi - x}{2}.$

[Fig. 2]

In fact, a simple calculation yields for the difference
$\dfrac{\pi - x}{2} - S_n^{(1)}(x)$ the noteworthy formula

(8)  $\dfrac{\pi - x}{2} - S_n^{(1)}(x) = \dfrac{\pi - x}{2} - \dfrac{s_0(x) + s_1(x) + \cdots + s_n(x)}{n + 1} = \dfrac{1}{2(n + 1)} \displaystyle\int_x^{\pi} \left( \dfrac{\sin \frac{(n+1)t}{2}}{\sin \frac{t}{2}} \right)^{2} dt,$

from which follows at once the inequality

(9)  $\dfrac{s_0(x) + s_1(x) + \cdots + s_n(x)}{n + 1} < \dfrac{\pi - x}{2}, \qquad n = 0, 1, 2, 3, \ldots;\ 0 < x < \pi.$

Summarizing, I may therefore say that for the arithmetical means of the
first order $S_n^{(1)}(x)$, the inequalities

(10)  $0 < S_n^{(1)}(x) < \dfrac{\pi - x}{2}, \qquad n = 1, 2, 3, \ldots;\ 0 < x < \pi,$

are true.
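The inequalities (10) and the integral formula (8) lend themselves to a quick numerical comparison. The sketch below is an illustration under stated assumptions: it writes the first-order mean with the weights $1 - k/(n+1)$ (which is what averaging $s_0, \ldots, s_n$ gives), and it evaluates the integral in (8) by a plain midpoint rule with a step count chosen arbitrarily. The helper names are hypothetical.

```python
import math


def fejer_mean(n: int, x: float) -> float:
    """S_n^{(1)}(x): the average of s_0, ..., s_n for the series sum sin(kx)/k.

    The term sin(kx)/k occurs in s_k, ..., s_n, hence the weight (n-k+1)/(n+1).
    """
    return sum((1 - k / (n + 1)) * math.sin(k * x) / k for k in range(1, n + 1))


def gap_by_formula(n: int, x: float, steps: int = 4000) -> float:
    """Right-hand side of (8), approximated by the midpoint rule on [x, pi]."""
    h = (math.pi - x) / steps
    total = 0.0
    for j in range(steps):
        t = x + (j + 0.5) * h
        total += (math.sin((n + 1) * t / 2) / math.sin(t / 2)) ** 2
    return total * h / (2 * (n + 1))


if __name__ == "__main__":
    n, x = 5, 1.0
    direct_gap = (math.pi - x) / 2 - fejer_mean(n, x)
    print(direct_gap, gap_by_formula(n, x))  # the two numbers should nearly agree
```

The midpoint rule is only an approximation, so the two printed values agree up to the quadrature error, not exactly.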




5. Let us now consider the arithmetical means (also known as the
Cesàro means) $S_n^{(2)}(x)$ and $S_n^{(3)}(x)$ of the 2nd and 3rd order,
respectively, of the simple sine series (2). These will likewise satisfy
the inequalities

(11)  $0 < S_n^{(2)}(x) < \dfrac{\pi - x}{2},$

(12)  $0 < S_n^{(3)}(x) < \dfrac{\pi - x}{2},$

$n = 1, 2, 3, \ldots;\ 0 < x < \pi.$

In general the inequalities $0 < S_n^{(k)}(x) < \dfrac{\pi - x}{2}$ are valid
for all $k \ge 1$. But of what significance are the arithmetical means of
order higher than the first? Just recently I found the theorem that all of
the arithmetical means of the 3rd order $S_n^{(3)}(x)$, $(n = 1, 2, 3, \ldots)$,
of the series

$\dfrac{\sin x}{1} + \dfrac{\sin 2x}{2} + \cdots + \dfrac{\sin nx}{n} + \cdots$

are convex upwards on the entire interval $0 < x < \pi$. The arithmetical
means $S_n^{(0)}(x)$, $S_n^{(1)}(x)$, $S_n^{(2)}(x)$, of the 0th, 1st, and 2nd
orders, respectively, of our series $\sum_{n=1}^{\infty} \dfrac{\sin nx}{n}$
are, however, not all convex upwards on the entire interval $0 < x < \pi$.
If we consider in succession the infinite sequences of the arithmetical
means of the 0th, 1st, 2nd and 3rd orders of our series, we find that an
unexpected change occurs for the 3rd arithmetical mean, namely, the means
of the 3rd order are, for $n = 1, 2, 3, \ldots$, all convex upwards on the
entire interval $0 < x < \pi$.

That this property is not true for $S_n^{(0)}(x)$, $S_n^{(1)}(x)$,
$S_n^{(2)}(x)$, is a consequence of the following fact: the 2nd derivatives
of these Cesàro means take the form

(13)  $\dfrac{1}{n} \dfrac{d^2}{dx^2} S_n^{(0)}(x) = \dfrac{\cos (2n + 1)\frac{x}{2}}{2 \sin \frac{x}{2}} + \varepsilon_n(x),$

(14)  $\dfrac{d^2}{dx^2} S_n^{(1)}(x) = \dfrac{\sin (n + 1)x}{4 \sin^2 \frac{x}{2}} + \delta_n(x),$

(15)  $\dfrac{d^2}{dx^2} S_n^{(2)}(x) = -\dfrac{1}{4n \sin^3 \frac{x}{2}} \left\{ 2 \cos \dfrac{x}{2} + \cos (2n + 3)\dfrac{x}{2} + \eta_n(x) \right\},$

where

(16)  $\lim_{n \to \infty} \varepsilon_n(x) = \lim_{n \to \infty} \delta_n(x) = \lim_{n \to \infty} \eta_n(x) = 0,$

and indeed, the limit is approached uniformly in every closed subinterval
of the interval $0 < x < \pi$.

With the help of these asymptotic formulas one may discuss the
points of inflection of the mean curves $S_n^{(0)}(x)$, $S_n^{(1)}(x)$,
$S_n^{(2)}(x)$, and consequently, the above assertion may be proved very
readily.
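The change of behaviour at the 3rd order can be observed directly by evaluating the second derivative of the Cesàro means. The sketch below assumes the usual definition of the Cesàro mean of order k, in which the term of index $\nu$ receives the weight $A^{k}_{n-\nu} / A^{k}_{n}$ with $A^{k}_{m} = \binom{m+k}{k}$; the lecture does not write this definition out, so it is stated here as an assumption, and the function names are illustrative.

```python
import math


def cesaro_weight(order: int, n: int, nu: int) -> float:
    """Weight A^{order}_{n-nu} / A^{order}_n, with A^k_m = C(m + k, k).

    For order 0 every weight is 1, which gives the ordinary partial sum.
    """
    return math.comb(n - nu + order, order) / math.comb(n + order, order)


def second_derivative(order: int, n: int, x: float) -> float:
    """d^2/dx^2 of the Cesaro mean S_n^{(order)}(x) of the series sum sin(nu x)/nu.

    Differentiating sin(nu x)/nu twice gives -nu*sin(nu x).
    """
    return -sum(
        cesaro_weight(order, n, nu) * nu * math.sin(nu * x) for nu in range(1, n + 1)
    )


def max_second_derivative(order: int, n: int, points: int = 1000) -> float:
    """Largest sampled second derivative on a grid inside (0, pi).

    A positive value means the mean curve is not convex upwards everywhere.
    """
    return max(
        second_derivative(order, n, math.pi * j / (points + 1))
        for j in range(1, points + 1)
    )


if __name__ == "__main__":
    for order in range(4):
        print(order, [round(max_second_derivative(order, n), 4) for n in (2, 5, 10)])
```

For n = 2 the second derivatives can be written out by hand: they are $-\sin x\,(1 + 4\cos x)$, $-\tfrac{2}{3}\sin x\,(1 + 2\cos x)$, $-\sin x\,(\tfrac12 + \tfrac23\cos x)$ and $-\tfrac{2}{5}\sin x\,(1 + \cos x)$ for the orders 0, 1, 2, 3, and only the last one keeps a fixed sign on the whole interval.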


[Fig. 3]

6. I turn now to a second example. Let us join by straight line
segments the points $(0, 0)$, $(a, b)$, $(\pi, 0)$. We now have

(17)  $f(x) = \dfrac{b}{a}\,x$ for $0 \le x \le a$, $\qquad f(x) = \dfrac{b}{\pi - a}\,(\pi - x)$ for $a \le x \le \pi$, $\qquad 0 < a < \pi;\ b > 0,$

while the Fourier sine series for $f(x)$ yields

(18)  $f(x) = \dfrac{2b}{a(\pi - a)} \left\{ \dfrac{\sin a \cdot \sin x}{1^2} + \dfrac{\sin 2a \cdot \sin 2x}{2^2} + \cdots + \dfrac{\sin na \cdot \sin nx}{n^2} + \cdots \right\}.$

Again, corresponding to this last “roof function” $f(x)$, we may state
properties for the mean functions $S_n^{(0)}(x)$, $S_n^{(1)}(x)$,
$S_n^{(2)}(x)$, $S_n^{(3)}(x)$ quite analogous to the previous ones; that is
to say,

(19)  $0 < S_n^{(0)}(x),$

(20)  $0 < S_n^{(1)}(x) < f(x),$

(21)  $0 < S_n^{(2)}(x) < f(x),$

(22)  $0 < S_n^{(3)}(x) < f(x),$

and the arithmetical means of the 3rd order $S_n^{(3)}(x)$, $n = 1, 2, 3, \ldots$,
are all convex upwards on the entire interval $0 < x < \pi$.

The proposition $0 < S_n^{(0)}(x)$ relating to the partial sums of the sine
series for the roof function $f(x)$ is due to L. Koschmieder; the proof
depends on the corresponding property of the partial sums of the straight
line just mentioned. The remaining theorems concerning the roof
function are new. I shall not go into the details of their proofs in this
lecture, but I shall have something to say later about the underlying
notions upon which these proofs depend.


Finally, let us note that the straight line mentioned at the beginning
of this lecture may be obtained by a limiting process from the roof
curve just considered. In fact, if b is kept fixed and a tends to 0, we
obtain in the limit a straight line which passes through the point $(\pi, 0)$.
If, on the other hand, a tends to $\pi$, while b again remains constant, we
obtain in the limit a straight line which passes through the point $(0, 0)$.
In both limiting cases the previously stated theorems for the roof
function of §6 are true.

7. As a third illustration, let $f(x)$ be defined by a polygon having
$(k + 1)$ vertices, $P_0, P_1, \ldots, P_k$. Let the polygon lie entirely
above the axis of abscissae, and let it be convex upwards (not concave) on
the interval $0 < x < \pi$. Let me now make use of an elementary theorem
which enters in certain investigations of W. Blaschke, P. Frank, and
G. Pick; namely, that such a polygon function $f(x)$ is expressible as a
finite sum of roof functions

(23)  $f(x) = \varphi_0(x) + \varphi_1(x) + \cdots + \varphi_k(x).$




(By the additional use of the limiting cases mentioned in §6, we find
that $k + 1$ such roof functions are needed in the representation
of $f(x)$ given by (23); i.e., as many as there are vertices in the given
polygon.)

From this representation of a positive, convex polygon function as a
finite sum of roof functions, it follows with one stroke that the theorems
which relate to the roof function heretofore mentioned are also true for
the polygon function $f(x)$. I shall content myself merely with the
consideration of $S_n^{(3)}(x)$ of the polygon function. From the form of
$f(x)$ given by (23) we obtain at once that the arithmetical mean
$S_n^{(3)}(x)$ of the Fourier sine series for $f(x)$, which I now denote by
$S_n^{(3)}(x; f)$, is precisely the sum of all $S_n^{(3)}(x; \varphi_i)$, i.e.,

(24)  $S_n^{(3)}(x; f) = \sum_{i=0}^{k} S_n^{(3)}(x; \varphi_i).$

Since, as we have already seen, the means $S_n^{(3)}(x; \varphi_i)$ of the roof
functions are all positive and convex upward on the interval $0 < x < \pi$,
and since further, a finite sum of such functions is also positive and
convex upward on the interval $0 < x < \pi$, we have, therefore, shown
that $S_n^{(3)}(x)$ of our polygon function is again positive and convex
upward on this interval $0 < x < \pi$.

8. Finally, let $f(x)$ be an arbitrary function of x, which is positive in
the interior of the interval $0 < x < \pi$, and convex upward (not concave)
in this interval. Such a curve $y = f(x)$ may be approximated
as closely as we wish, and in fact, uniformly, on the interval
$0 \le x \le \pi$, by precisely such inscribed polygons $P_0 P_1 P_2 \ldots P_k$
as we have just discussed. But we know that if we have a uniformly
convergent sequence of functions, each member of which is positive and
convex upward on the interval $0 < x < \pi$, then the limit function will
also be positive and convex upward (not concave) on this same interval. I
may, therefore, by an obvious limiting process in which n is kept fixed,
readily obtain the following theorem:

Given a function $f(x)$ which is positive on the interval $0 < x < \pi$
and which is convex upward (non-concave), then the arithmetical means
$S_n^{(0)}(x)$, $S_n^{(1)}(x)$, $S_n^{(2)}(x)$, $S_n^{(3)}(x)$ of the 0th, 1st,
2nd, and 3rd orders, respectively, of its Fourier sine series development,
possess the following properties:

(25)  $0 < S_n^{(0)}(x),$

(26)  $0 < S_n^{(1)}(x) \le f(x),$

(27)  $0 < S_n^{(2)}(x) \le f(x),$

(28)  $0 < S_n^{(3)}(x) \le f(x),$

for the interval $0 < x < \pi$, $n = 1, 2, 3, \ldots$.

Moreover, the arithmetical means of the 3rd order, $S_n^{(3)}(x)$, are all
convex upwards on the interval $(0, \pi)$.

This last property concerning the convexity is not true for the means of
lower order, namely $S_n^{(0)}(x)$, $S_n^{(1)}(x)$, $S_n^{(2)}(x)$.

9. In this connection I should like to call attention to the interesting
special case in which the curve $f(x)$ is symmetrical on the interval
$0 < x < \pi$ with respect to the line $x = \dfrac{\pi}{2}$, so that

(29)  $f(\pi - x) = f(x).$

In this symmetrical case the property of convexity is already possessed
by the arithmetical means $S_n^{(2)}(x)$ of the 2nd order; that is to say,
if $f(x)$ is positive and non-concave as seen from above, on the interval
$0 < x < \pi$, then the arithmetical means of the 2nd order, $S_n^{(2)}(x)$,
of the Fourier sine series for $f(x)$ are all positive and convex upward on
the same interval.

This result is somewhat more profound than the one mentioned previously.
One may establish this theorem by means of the special case
in which $y = f(x)$ represents an isosceles trapezoid.

10. The accompanying Figure 9 illustrates this theorem in the simplest
imaginable special case, in which $f(x)$ is constant in the interval
$0 \le x \le \pi$ (thus $f(x)$ is non-concave upward). The Fourier sine series
of the function

(30)  $f(x) = \dfrac{\pi}{2}, \qquad 0 < x < \pi,$

is

(31)  $f(x) = 2 \left[ \dfrac{\sin x}{1} + 0 \cdot \dfrac{\sin 2x}{2} + \dfrac{\sin 3x}{3} + 0 \cdot \dfrac{\sin 4x}{4} + \dfrac{\sin 5x}{5} + \cdots \right].$

The upper curve of this figure represents the partial sum of index 12 of
this sine series (31), i.e.,

$S_{12}^{(0)}(x) = 2 \sum_{\nu=1}^{6} \dfrac{\sin (2\nu - 1)x}{2\nu - 1}.$

The middle curve represents the arithmetical mean of first order
$S_{12}^{(1)}(x)$ and the lower curve the Cesàro mean of second order
$S_{12}^{(2)}(x)$ of the series (31). The interesting curves
$S_{12}^{(0)}(x)$, $S_{12}^{(1)}(x)$ are already found in the
book of H. S. Carslaw on Fourier Series, in connection with the discussion
of the famous Gibbs' phenomenon, which arises in the case of the
partial sums $S_n^{(0)}(x)$, but disappears for the means of first order
$S_n^{(1)}(x)$. If we now examine the three curves from the point of view of
convexity-concavity, we observe that both $S_{12}^{(0)}(x)$ and
$S_{12}^{(1)}(x)$ are composed of arcs which are alternately convex and
concave upward. On the other hand, the new curve $S_{12}^{(2)}(x)$, which
represents the Cesàro mean of 2nd order of index 12 for the series (31),
is convex upward in the whole interval $0 < x < \pi$.

11.  We  have  seen  that  we  arrive  at  the  general  theorems  stated  above 
essentially  by  the  use  of  the  roof  and  isosceles  trapezoid  functions. 

How  can  we  prove  the  general  theorems  for  these  special  cases?  I 
have  already  intimated  that  I  would  say  something  on  this  subject. 

In 1900 I observed that the arithmetical means of the 1st order of the
partial sums of the series

(32)  $1 + 2\cos\theta + 2\cos 2\theta + 2\cos 3\theta + \cdots + 2\cos n\theta + \cdots$

are all non-negative for every real value of $\theta$.

Recently I found that for the series

(33)  $0 + 1 \cdot \sin\theta + 0 \cdot \sin 2\theta + 3 \cdot \sin 3\theta + 0 \cdot \sin 4\theta + 5 \cdot \sin 5\theta + \cdots$

the arithmetical means of the 2nd order of the partial sums (in fact, both
for the Cesàro and for the Hölder means) are all positive on the interval
$0 < \theta < \pi$.

Further, I have found that for the series

(34)  $0 + 1 \cdot \sin\theta + 2 \cdot \sin 2\theta + 3 \cdot \sin 3\theta + \cdots$

the arithmetical means of the 3rd order of the partial sums (in fact, both
for the Cesàro and for the Hölder means) are all positive on the interval
$0 < \theta < \pi$. (The mean of index 0 forms an exception in both series,
since this mean is identically 0.)
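The two positivity statements for the series (33) and (34) can be sampled numerically. The sketch below covers only the Cesàro means (not the Hölder means), and it assumes the usual Cesàro weights $A^{k}_{n-\nu} / A^{k}_{n}$ with $A^{k}_{m} = \binom{m+k}{k}$, which the lecture does not write out. All function names are illustrative.

```python
import math


def cesaro_mean(coeff, order: int, n: int, theta: float) -> float:
    """Cesaro mean of the given order and index n of sum coeff(nu) * sin(nu * theta)."""
    norm = math.comb(n + order, order)
    return sum(
        math.comb(n - nu + order, order) / norm * coeff(nu) * math.sin(nu * theta)
        for nu in range(n + 1)
    )


def series_33(nu: int) -> int:
    """Coefficients of (33): nu for odd nu, 0 for even nu."""
    return nu if nu % 2 == 1 else 0


def series_34(nu: int) -> int:
    """Coefficients of (34): nu for every nu."""
    return nu


def grid_min(coeff, order: int, n: int, points: int = 1000) -> float:
    """Smallest sampled value of the mean on a grid inside (0, pi)."""
    return min(
        cesaro_mean(coeff, order, n, math.pi * j / (points + 1))
        for j in range(1, points + 1)
    )


if __name__ == "__main__":
    for n in (1, 2, 3, 5, 10, 20):
        print(n, grid_min(series_33, 2, n), grid_min(series_34, 3, n))
```

Lowering the order by one in either call shows where the sampled minimum first becomes negative, which is one way to see that the orders 2 and 3 named in the text cannot simply be reduced.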

Now  these  two  theorems  concerning  the  series  (33)  and  (34)  (which 
were  long  since  considered  by  Euler)  form  the  basis  of  my  proof. 

In  addition,  I  make  use  of  an  important  lemma  which  was  suggested 
to  me  by  a  remarkable  observation  made  by  Koschmieder,  and  which  I 
have  formulated  as  follows: 

Given a set of non-negative numbers $\lambda_1 \ge 0$, $\lambda_2 \ge 0$, $\ldots$,
$\lambda_n \ge 0$, such that $\lambda_1^2 + \lambda_2^2 + \cdots + \lambda_n^2 \ne 0$,
then the inequality

(35)  $\lambda_1 \sin x \sin y + \lambda_2 \sin 2x \sin 2y + \cdots + \lambda_n \sin nx \sin ny \ge 0$

is true in the interior of the square

(36)  $0 < x < \pi, \qquad 0 < y < \pi,$

if and only if the inequality

(37)  $\lambda_1 \sin x + 2\lambda_2 \sin 2x + 3\lambda_3 \sin 3x + \cdots + n\lambda_n \sin nx \ge 0$

holds for all points of the interval

(38)  $0 < x < \pi.$

When the left member of (37) is actually greater than 0, then the
left member of (35) is also greater than 0.
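The lemma can be illustrated on two small coefficient sets. In the sketch below the first set, $\lambda = (1, \tfrac12)$, is a hand-picked example for which (37) equals $\sin x\,(1 + 2\cos x)$ and so changes sign; the second set uses 3rd-order Cesàro weights, assumed here to be $\binom{n-\nu+3}{3} / \binom{n+3}{3}$, for which (37) is a 3rd-order mean of the series (34). Both choices and all names are assumptions made for illustration.

```python
import math


def form_35(lam, x: float, y: float) -> float:
    """Left member of (35): sum of lam[k-1] * sin(kx) * sin(ky)."""
    return sum(l * math.sin(k * x) * math.sin(k * y) for k, l in enumerate(lam, 1))


def form_37(lam, x: float) -> float:
    """Left member of (37): sum of k * lam[k-1] * sin(kx)."""
    return sum(k * l * math.sin(k * x) for k, l in enumerate(lam, 1))


def cesaro3_weights(n: int):
    """Assumed 3rd-order Cesaro weights for the terms of index 1, ..., n."""
    return [math.comb(n - nu + 3, 3) / math.comb(n + 3, 3) for nu in range(1, n + 1)]


def min_35(lam, points: int = 40) -> float:
    """Smallest sampled value of (35) on a grid inside the square (36)."""
    grid = [math.pi * j / (points + 1) for j in range(1, points + 1)]
    return min(form_35(lam, x, y) for x in grid for y in grid)


def min_37(lam, points: int = 40) -> float:
    """Smallest sampled value of (37) on a grid inside the interval (38)."""
    return min(form_37(lam, math.pi * j / (points + 1)) for j in range(1, points + 1))


if __name__ == "__main__":
    print(min_37([1.0, 0.5]), min_35([1.0, 0.5]))          # both negative
    print(min_37(cesaro3_weights(4)), min_35(cesaro3_weights(4)))  # both positive
```

In both cases the signs of the two sampled minima agree, as the lemma predicts; the grid only samples the square, so this is a consistency check and not a verification of the lemma.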

12. In the first part of this lecture we considered functions $f(x)$
which are positive and convex upward (non-concave) on the interval
$0 < x < \pi$. We have seen that when we apply to the sequence of the
partial sums of the Fourier sine series the simple process of constructing
the arithmetical means, and then repeat this process several times, new
trigonometric approximation curves of $f(x)$ arise, which are all positive
and convex upward on the interval $0 < x < \pi$. This theorem reveals
the “smoothing out effect” arising from the repeated operations with
the arithmetical means. Other theorems of this character exist, but I
shall not go into them at this time. I would rather turn in another
direction and show this “smoothing out effect” of the mean operation
in an entirely different manner.

Let

(39)  $f(z) = c_0 + c_1 z + c_2 z^2 + \cdots + c_n z^n + \cdots$

be the power series development of a regular analytic function of the
complex variable z, valid for $|z| < 1$. I assume at the outset that the
function $f(z)$ is univalent (schlicht) in the interior of the unit circle
$|z| < 1$, that is, if $z_1$ and $z_2$ are two distinct arbitrary values in
the unit circle, then $f(z_1)$ and $f(z_2)$, the corresponding values of
$f(z)$, are also distinct. In other words,

(40)  $f(z_1) \ne f(z_2),$

when $z_1 \ne z_2$; $|z_1|, |z_2| < 1$.

I now ask the question: Are the approximating partial sums of $f(z)$,
namely,

(41)  $S_n^{(0)}(z) = s_n(z) = c_0 + c_1 z + c_2 z^2 + \cdots + c_n z^n, \qquad n = 2, 3, 4, 5, \ldots,$

univalent or not univalent in the interior of the unit circle $|z| < 1$?

I shall present two examples.

First example.

(42)  $f(z) = \dfrac{1}{1 - z} = 1 + z + z^2 + \cdots + z^n + \cdots.$

Here $f(z)$ is regular and univalent for $|z| < 1$. Since

(43)  $\dfrac{d s_n(z)}{dz} = 1 + 2z + 3z^2 + \cdots + n z^{n-1},$

all the roots of $\dfrac{d s_n(z)}{dz}$ lie in the interior of the unit circle
when $n \ge 2$. Hence, the partial sums of this geometric progression are
not univalent for $|z| < 1$.
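For the two smallest indices the first example can be made concrete by direct computation. The sketch below treats only n = 2 and n = 3, where the roots of (43) are available in closed form; the pair of points used to exhibit a collision for n = 2 is a hand-picked illustration, and the function names are not from the lecture.

```python
import cmath


def s(n: int, z: complex) -> complex:
    """Partial sum s_n(z) = 1 + z + ... + z^n of the geometric series (42)."""
    return sum(z**k for k in range(n + 1))


def ds(n: int, z: complex) -> complex:
    """Derivative (43): 1 + 2z + 3z^2 + ... + n z^(n-1)."""
    return sum(k * z ** (k - 1) for k in range(1, n + 1))


def critical_points_n3():
    """Roots of 1 + 2z + 3z^2 from the quadratic formula."""
    disc = cmath.sqrt(4 - 12)
    return [(-2 + disc) / 6, (-2 - disc) / 6]


if __name__ == "__main__":
    # n = 2: the derivative 1 + 2z vanishes at z = -1/2, inside the unit circle.
    print(abs(ds(2, -0.5)))
    # Two distinct points of the unit circle with z1 + z2 = -1 share one value of s_2.
    z1, z2 = complex(-0.5, 0.3), complex(-0.5, -0.3)
    print(s(2, z1), s(2, z2))
    # n = 3: both critical points have modulus 1/sqrt(3).
    print([abs(r) for r in critical_points_n3()])
```

The collision for n = 2 follows from $s_2(z_1) - s_2(z_2) = (z_1 - z_2)(1 + z_1 + z_2)$, which vanishes whenever $z_1 + z_2 = -1$.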

Second example. As a second example I select the important series
which enters in the numerical calculation of logarithms,

(44)  $f(z) = \dfrac{1}{2} \log \dfrac{1 + z}{1 - z} = z + \dfrac{z^3}{3} + \dfrac{z^5}{5} + \cdots.$

[Fig. 10]  [Fig. 11]

Here the function $w = u + iv = f(z)$ is again univalent for $|z| < 1$.
It maps the unit circle of the z-plane into a strip parallel to the u-axis,
having the width $\dfrac{\pi}{2}$. In this case, however, all the partial sums

(45)  $s_n(z) = z + \dfrac{z^3}{3} + \cdots + \dfrac{z^{2n-1}}{2n - 1}, \qquad n = 1, 2, 3, \ldots,$

of the series (44) are univalent in the unit circle, including the boundary,
that is to say, for $|z| \le 1$. This result is due to J. W. Alexander.

I prove this property as follows. If we set

(46)  $s_n(e^{i\theta}) = u(\theta) + i\,v(\theta),$

then u and v take the form

(47)  $u(\theta) = \sum_{\nu=1}^{n} \dfrac{\cos (2\nu - 1)\theta}{2\nu - 1}, \qquad v(\theta) = \sum_{\nu=1}^{n} \dfrac{\sin (2\nu - 1)\theta}{2\nu - 1}.$

Since, as is well known, the trigonometric polynomial $v(\theta)$ is positive
for all values of $\theta$ on the interval $0 < \theta < \pi$, the ordinates of
the image points $s_n(e^{i\theta})$ are continually positive when $e^{i\theta}$
describes the upper half of the boundary of the unit circle $|z| = 1$ (in
the positive sense). Since we also know that

(48)  $\sum_{\nu=1}^{n} \sin (2\nu - 1)\theta \ge 0, \qquad \text{for } 0 < \theta < \pi,$

we see that as the points $e^{i\theta}$ describe the boundary of the upper half
of the unit circle, the abscissas $u(\theta)$ of the image points
$s_n(e^{i\theta})$ are monotonically decreasing for $0 < \theta < \pi$. From this
it follows that if we now take into account the symmetry of the map with
respect to the u-axis, the map of the complete circle $z = e^{i\theta}$,
$0 \le \theta < 2\pi$, formed by $s_n(e^{i\theta})$ on the function plane, is a
Jordan curve. Hence, the partial sums

(49)  $s_n(z) = z + \dfrac{z^3}{3} + \cdots + \dfrac{z^{2n-1}}{2n - 1}$

are univalent for $|z| \le 1$.
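The two facts used in this argument, the positivity of $v(\theta)$ and the sign condition (48), can be sampled numerically. The sketch below also compares the sum in (48) with the closed form $\sin^2 n\theta / \sin\theta$, a standard trigonometric identity that the lecture does not quote; the function names and grid size are illustrative.

```python
import math


def u(n: int, theta: float) -> float:
    """Abscissa u(theta) in (47)."""
    return sum(math.cos((2 * k - 1) * theta) / (2 * k - 1) for k in range(1, n + 1))


def v(n: int, theta: float) -> float:
    """Ordinate v(theta) in (47)."""
    return sum(math.sin((2 * k - 1) * theta) / (2 * k - 1) for k in range(1, n + 1))


def odd_sine_sum(n: int, theta: float) -> float:
    """Left member of (48); note that du/dtheta equals minus this sum."""
    return sum(math.sin((2 * k - 1) * theta) for k in range(1, n + 1))


def closed_form(n: int, theta: float) -> float:
    """sin^2(n*theta) / sin(theta), which equals the sum in (48) for 0 < theta < pi."""
    return math.sin(n * theta) ** 2 / math.sin(theta)


if __name__ == "__main__":
    n = 6
    grid = [math.pi * j / 201 for j in range(1, 201)]
    print(min(v(n, t) for t in grid))             # positive
    print(min(odd_sine_sum(n, t) for t in grid))  # non-negative up to rounding
    print(all(u(n, a) >= u(n, b) for a, b in zip(grid, grid[1:])))  # u decreases
```

The closed form makes the sign in (48) evident, since it is a square divided by a positive quantity on the open interval.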

From these two examples we see that, while a function $f(z)$ is univalent
in the circle of convergence of its power series development, the partial
sums of this development may or may not be univalent.

13. Now, Szegő has proved the following important theorem:

If $f(z)$ is regular and univalent for $|z| < 1$, then all the partial sums
$s_n(z)$ of the power series development of $f(z)$ are univalent for the
circle $|z| < \dfrac{1}{4}$ (with the exception, of course, of the case
$s_0(z) = $ constant).

This theorem is not true for a value of the radius greater than $\dfrac{1}{4}$.
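That the radius cannot be enlarged can be seen on a standard example which is not taken from the lecture: the function $z/(1-z)^2 = z + 2z^2 + 3z^3 + \cdots$, which is univalent in the unit circle. Its partial sum $s_2(z) = z + 2z^2$ has a vanishing derivative at $z = -\tfrac14$. The sketch below constructs, for any radius slightly larger than $\tfrac14$, two distinct points inside that radius with equal values of $s_2$; the helper name and the way the pair is chosen are illustrative assumptions.

```python
import math


def s2(z: complex) -> complex:
    """Second partial sum z + 2z^2 of z/(1-z)^2 (an example chosen for illustration)."""
    return z + 2 * z * z


def colliding_pair(radius: float):
    """Two distinct points with |z| < radius and equal s2, for any radius > 1/4.

    Since s2(z1) - s2(z2) = (z1 - z2) * (1 + 2*(z1 + z2)), any pair with
    z1 + z2 = -1/2 collides; the pair below is symmetric about the real axis.
    """
    if radius <= 0.25:
        raise ValueError("no such pair is constructed for radius <= 1/4")
    eps = math.sqrt(radius**2 - 1 / 16) / 2
    return complex(-0.25, eps), complex(-0.25, -eps)


if __name__ == "__main__":
    z1, z2 = colliding_pair(0.26)
    print(abs(z1), abs(z2), s2(z1), s2(z2))
```

As the radius shrinks towards $\tfrac14$ the two points merge at $z = -\tfrac14$, the zero of $1 + 4z$.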

[Fig. 13]

14. If we consider, with Szegő, the totality of the regular univalent
functions for $|z| < 1$, we find that we can add nothing new to this last
theorem. I will therefore confine myself to a sub-class of the totality of
univalent functions $f(z)$. I assume first that in the development of
$w = f(z)$ the coefficients $c_0, c_1, \ldots, c_n, \ldots$ are all real, that
is, when we map the inside of the unit circle $|z| < 1$ by means of the
function $w = f(z)$, the image in the $w = u + iv$ plane will be symmetrical
with respect to the u-axis. Secondly, I assume that the images of the
circles $|z| = r$, $0 < r < 1$, arising from the conformal map $w = f(z)$,
are all convex in the direction of the v-axis, that is, that every
parallel to the v-axis will
have in common with the image-curve two points, one point, or no
point. Such a situation will occur when the imaginary components of
$z f'(z)$ have a fixed sign over the entire upper half of the unit circle.
If, in particular, the domain (finite or infinite) onto which the unit
circle $|z| < 1$ is mapped by the function $f(z)$ is convex in the usual
sense, then our condition is always fulfilled.




For  the  univalent  functions  f(z)  which  fulfill  both  these  conditions, 
we  may  now  state  the  following  theorem : 

The arithmetical means of the 3rd order $S_n^{(3)}(z)$ corresponding to the
regular and univalent power series

(50)  $f(z) = c_0 + c_1 z + c_2 z^2 + \cdots + c_n z^n + \cdots$

for $|z| < 1$, are all univalent in the entire unit circle $|z| < 1$ (the
mean $S_0^{(3)}(z) = c_0$ always excepted).

[Fig. 15]  [Fig. 16]

15. Particularly interesting is the case where $f(z)$ is further restricted
in such a way that $c_0 = c_2 = c_4 = \cdots = 0$, that is, that the image of
the unit circle is symmetrical both with respect to the real u-axis and to
the imaginary v-axis.




The development of $f(z)$ takes the form

(51)  $f(z) = c_1 z + c_3 z^3 + c_5 z^5 + \cdots.$

In this case all the arithmetical means of the 2nd order, $S_n^{(2)}(z)$, of
the power series are univalent in the entire unit circle $|z| < 1$.

In addition, for this special case, I can give a statement sharper than
the Szegő theorem, which refers to the original partial sums of the power
series of $f(z)$. Namely, if $f(z)$ is regular and univalent for all
$|z| < 1$, then all the partial sums of its power series are univalent even
in the circle

(52)  $|z| < \dfrac{1}{\sqrt{3}}.$

The example

$f(z) = \dfrac{z}{1 - z^2},$

which is also instructive in other respects, shows that for the subclass of
univalent functions considered, $\dfrac{1}{\sqrt{3}}$ is the largest value for
which the theorem will hold. (Note that the function $\dfrac{z}{1 - z^2}$ maps
the unit circle $|z| < 1$ onto the entire w-plane, cut along the straight
lines joining $\dfrac{i}{2}$ with $\infty i$ and $-\dfrac{i}{2}$ with
$-\infty i$.)
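The role of the example can be made explicit with its first non-trivial partial sum. Since $z/(1-z^2) = z + z^3 + z^5 + \cdots$, the partial sum $z + z^3$ has the derivative $1 + 3z^2$, which vanishes at $z = \pm i/\sqrt{3}$. The sketch below builds, for a radius slightly larger than $1/\sqrt{3}$, two distinct points inside that radius where $z + z^3$ takes the same value; the construction of the pair and the helper names are illustrative assumptions and are not taken from the lecture.

```python
import math


def s3(z: complex) -> complex:
    """Partial sum z + z^3 of the series for z / (1 - z^2)."""
    return z + z**3


def colliding_pair(radius: float):
    """Two distinct points with |z| < radius and equal s3, for radius > 1/sqrt(3).

    For z = eps + i*y the real part of s3(z) is eps * (1 + eps^2 - 3*y^2), so
    choosing y^2 = (1 + eps^2) / 3 makes s3(z) purely imaginary, and the mirror
    image -eps + i*y then has the same value.
    """
    if 3 * radius**2 <= 1:
        raise ValueError("no such pair is constructed for radius <= 1/sqrt(3)")
    eps = math.sqrt(3 * radius**2 - 1) / 4
    y = math.sqrt((1 + eps**2) / 3)
    return complex(eps, y), complex(-eps, y)


if __name__ == "__main__":
    z1, z2 = colliding_pair(0.6)
    print(abs(z1), abs(z2), s3(z1), s3(z2))
    # The derivative 1 + 3z^2 vanishes on the circle of radius 1/sqrt(3).
    zc = complex(0.0, 1 / math.sqrt(3))
    print(abs(1 + 3 * zc**2))
```

As the radius decreases to $1/\sqrt{3}$ the two points merge at $i/\sqrt{3}$, so the collision disappears exactly at the radius given in (52).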

To  summarize  briefly:  In  the  first  part  of  my  lecture  I  have  shown 
how  we  may  arrive  at  convex  trigonometric  approximating  polynomials 
from  a  convex  function  by  the  use  of  simple  mean  operations  on  the 
Fourier  partial  sums.  In  the  second  part  I  have  shown  how,  from  the 
partial  sums  of  certain  univalent  functions  we  may,  by  simple  mean 
operations,  obtain  univalent  approximating  polynomials. 


A  GENERAL  SURVEY  OF  THE  THEORY  OF  ADIABATIC 
INVARIANTS 


By Tullio Levi-Civita¹

Sir Arthur Eddington, in the opening words of his lecture “On the
expanding Universe” held here last year, very kindly quoted my
name, taking care however to avoid any further reference “not to
trouble you with mathematical analysis.”

This time you are truly unfortunate. Not only will mathematics
play a prominent rôle, but, what is worse, instead of hearing the eloquent
and fascinating speech of Eddington, you face a lecturer who, being less
than middling in his own language, will be particularly ill at ease in
yours.

I essentially rely on your indulgence, being encouraged to do so by the
great benevolence of my eminent colleagues and of the staff of this
University, who have granted me a great honour, not without taking half
the responsibility for the poor delivery of my lecture. This is devoted
to a subject which, though rather recent, received during the past
century fundamental pioneer work here in America from J. Willard
Gibbs, as may be inferred from his famous “Principles of statistical
mechanics” (1902). A synthetical hint in the advanced part of this
book (p. 157) afterwards led Paul Hertz² to the discovery of the
invariance of a certain volume, which dominates all the so-called
adiabatic processes.

In a few minutes I shall return to this in order to fix precisely
definitions and assumptions. For the present I wish only to complete
briefly the historical sketch, coming to the period in which adiabatic
invariants have played a prominent rôle in theoretical physics.

As you know, the great achievement of first explaining theoretically
spectroscopic lines was performed in 1913 by N. Bohr, through a
suggestive mechanical model of the hydrogen atom, having one peculiarity
in regard to some constants of integration (e.g. constants of energy, of
momentum, of angular momentum). In classical mechanics such
constants are capable of assuming continuously all values (within a
certain range); in Bohr’s theory, on the contrary, only a discrete set,
namely the integral multiples of a certain lowest determination.

¹ Professor of Rational Mechanics, University of Rome, Rome, Italy. (Ed.)

² Annalen der Physik, B. 33, 1910, p. 548. Compare also his article in
Weber-Gans, “Repertorium der Physik”, erster Band, zweiter Teil, Nr. 268.

By systematic use of this criterion and masterly skill Sommerfeld
succeeded in extending the mechanical model to all atoms and in
deducing numerical results in wonderful agreement with an imposing
congeries of facts. It was even believed for some time that the definitive
theory was attained, requiring perhaps improvements, but no
revolutionary changes. Such changes came indeed very soon, but
Sommerfeld’s acquisitions are still remarkable as a handy and, for many
purposes, quite useful approach to the atomic world.

Among these acquisitions, from the standpoint of theoretical
mechanics, the introduction and the further development of adiabatic
invariants are paramount. The name and the first acknowledgment
of their general interest are due to Ehrenfest.³ Here we are at the very
beginning and may therefore start on the subject.

1.  Ordinary  differential  systems  with  variable  parameters — Adiabatic 
invariants. 

Suppose that in a differential system

(1.1)  $\dfrac{dx_\nu}{dt} = X_\nu(x|t|a) \qquad (\nu = 1, 2, \ldots, N)$

the second members $X_\nu$ are given functions not only of the independent
variable t (we shall call it time) and of the dependent variables
$x_1, x_2, \ldots, x_N$, whose set is denoted simply by x, but also of a
certain number, say r, of parameters a. If the a’s are constants, or known
functions of t, the equations (1.1) are just right in number to define the
evolution of the x’s from a given state (initial values). If on the
contrary the a’s vary with the time in an unknown manner, then the
equations (1.1) no longer suffice for the task. It would be necessary to
join to them r further relations.

Sometimes however it is possible to get some useful information about
the x’s without any exact knowledge concerning the law of variation of
the parameters a. This happens for instance if:

1) The system (1.1) admits, when the a’s are taken rigorously
constant, some single-valued integrals

(1.2)  $F_\alpha(x|t|a) = c_\alpha \qquad (\alpha = 1, 2, \ldots, m).$

³ “Adiabatische Invarianten und Quantentheorie”, Annalen der Physik, B. 51,
1916, pp. 527-552; and “Adiabatic invariants and the theory of quanta”,
Philosophical Magazine, vol. XXXIII, 1917, pp. 500-513.



2)  Certain  not  too  restrictive  complementary  conditions  of  statistical 
character  are  fulfilled  (ergodic  behaviour,  which  will  be  later  on  duly 
emphasized). 

3)  The  variation  of  the  a’s,  though  unknown,  is  very  slow  (adiabatic). 

It is then possible to show that some functions I(a, c) exist, which
maintain their value unaltered although the a’s vary in an arbitrary but
slow manner.

Functions of this kind were first recognized by Ehrenfest and
called adiabatic invariants. He was led to their introduction by
thermodynamical analogies, in order to account for the early
quantization-rule⁴ of mechanical systems which, as recalled above,
corresponds to the fact that, for some $c_\alpha$, only values in
arithmetical progression are allowed.

2.  Macromechanical  illustration. 

It will be useful to give at once some particular, very elementary
examples.

Consider for instance the motion of a free particle P. The equation
in vectorial form is

(2.1)  $m\,\ddot{P} = F,$

where the symbols have the usual meaning. Let us suppose that some
parameters a, whose law of variation is partly or completely unknown,
enter in the expression of the force F, it being otherwise granted (by the
very nature of this force, or by whatever cause) that F, as a vector
applied at P, always meets a fixed straight line: say the z-axis.
Then the motion admits, whatever may be the variation of the a’s, the
integral of angular momentum round the axis of z; i.e.

$x\dot{y} - y\dot{x} = c.$

If  c  is  independent  of  the  a’s,  we  obviously  have,  according  to  the  definition 
of  section  1,  an  adiabatic  invariant.  In  this  particular  case  the  property 
holds,  without  any  further  restriction  about  the  manner  in  which  the 
a’s  vary. 
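
The statement that the integral $x\dot{y} - y\dot{x} = c$ survives any law of variation of the a’s can be checked numerically. What follows is only a minimal sketch under stated assumptions, not part of the lecture: the force law, the parameter law `a_of_t` (deliberately fast, not slow), the unit mass and the leapfrog integrator are all illustrative choices; the only property that matters is that the line of action of the force always meets the z-axis.

```python
import math

def a_of_t(t):
    # Hypothetical parameter law: fast on purpose, i.e. far from "adiabatic".
    return 1.0 + 0.8 * math.sin(7.0 * t)

def force(pos, t):
    # Assumed force on a unit mass: the horizontal part is parallel to (x, y),
    # so the line of action always meets the z-axis; the z part is arbitrary.
    x, y, z = pos
    a = a_of_t(t)
    return (-a * x, -a * y, -0.3 * z + 0.1 * a)

def angular_momentum_z(pos, vel):
    # The integral x*ydot - y*xdot of the text.
    return pos[0] * vel[1] - pos[1] * vel[0]

def integrate(pos, vel, t_end=20.0, dt=1e-3):
    # Leapfrog (kick-drift-kick) steps for m * Pddot = F with m = 1.
    t = 0.0
    for _ in range(int(round(t_end / dt))):
        f = force(pos, t)
        vel = tuple(v + 0.5 * dt * fi for v, fi in zip(vel, f))
        pos = tuple(p + dt * v for p, v in zip(pos, vel))
        t += dt
        f = force(pos, t)
        vel = tuple(v + 0.5 * dt * fi for v, fi in zip(vel, f))
    return pos, vel

pos0, vel0 = (1.0, 0.0, 0.2), (0.1, 0.7, 0.0)
c_start = angular_momentum_z(pos0, vel0)
pos1, vel1 = integrate(pos0, vel0)
c_end = angular_momentum_z(pos1, vel1)
print(c_start, c_end)
```

The two printed values should agree to rounding error although the parameter oscillates rapidly, which is the sense in which “adiabatic” is only nominal in this example.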

More  general  instances  of  this  simple  kind  of  adiabatic  invariants  are 
easily  offered  under  various  forms  by  ordinary  mechanics.  I  shall  only 
quote  a  pair  of  them: 

Firstly, a system of any number of bodies, exerting on each other forces
of any kind, but not acted upon by exterior forces. The unknown
mechanism of interior forces plays the rôle of the parameters a. Then the
motion takes place under rigorous conservation of two vectors: the
resultant Q of momenta and the resultant K of angular momenta (of the
bodies) with respect to a fixed point or to the center of inertia. These
vectors, and therefore each of their six components along fixed axes, are
then adiabatic invariants.

⁴ Besides Sommerfeld, at least Debye, Epstein, Schwarzschild must be
mentioned in this regard.

A more supple scheme leading to similar conclusions is the following:
Suppose that the motion of a material system of any degree of freedom
is governed by Hamiltonian equations, the characteristic function of
which

$H(p|q|a)$

depends not only on the coordinates and momenta q, p, but also on variable
parameters a, entering in H in a very general manner. If in particular the
mechanical nature of the system implies that H does not involve some
of the q, say $q_h$, then from

$\dfrac{dp_h}{dt} = -\dfrac{\partial H}{\partial q_h} = 0$

it follows

$p_h = \text{const.},$

and this is not only an integral belonging to our system, but an adiabatic
invariant too. Even in this case “adiabatic” figures nominally from the
definition of the previous number, but nothing is required about the
slowness of variation of the parameters a. Later on we shall consider
cases of a very different kind, where on the contrary the slow variation
is dominating.

3.  Geometric  language — Liouville's  systems. 

With obvious extension to any number of dimensions of current
geometric and kinematical notions we shall interpret $x_1, x_2, \ldots, x_N$
as cartesian coordinates of a point P in a euclidean $S_N$; and $X_\nu$,
given functions of P and t ($\nu = 1, 2, \ldots, N$), as components of a
vector v(P, t). Then equations (1.1) are summarized by

(3.1)  $\dfrac{dP}{dt} = \mathbf{v}$

and the theorem of existence takes the intuitive form of determination
of a motion P(t) from initial position $P_0$ and law of velocity v(P, t) as a
function of position and time.

Moreover we may consider a set of initial positions $P_0$ filling up an
N-dimensional continuum $C_0$ of $S_N$, and at any t [within the domain of
analytical regularity of (1.1)] the set of corresponding P, which will fill
up a new domain C.

We obtain in this way the general image of the motion of a continuous
system, especially of a fluid in $S_N$.

Among the differential systems (1.1) or corresponding motions, a very
ample and important class is formed by the so-called Liouville’s systems.
They are submitted to the sole condition that the divergence of the
vector-field v is zero, that is

(3.2)  $\operatorname{div} \mathbf{v} = \sum_{\nu=1}^{N} \dfrac{\partial X_\nu}{\partial x_\nu} = 0.$

This condition ensures constancy of extension (we shall often say
simply volume) in $S_N$; that is the equality of the euclidean volume of
any C to the volume of its initial determination $C_0$.

In the real hydrodynamical case (N = 3) this is the condition of
incompressibility which usually belongs to liquids.

From the purely mathematical standpoint (3.2) implies for the system
(1.1) the property of admitting as an integral invariant the extension

(3.3)  $V = \int_C dS_N,$

where

(3.4)  $dS_N = dx_1\, dx_2 \cdots dx_N.$
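
Condition (3.2), vanishing divergence of the right-hand sides, is easy to test numerically for a given system. The sketch below is an illustration only: the pendulum-like field `liouville_field`, the damped counter-example `damped_field`, and the central-difference helper `divergence` are hypothetical names and choices introduced here, not taken from the text.

```python
import math

def liouville_field(x, t, a):
    # Assumed example of right-hand sides X_nu(x|t|a): a canonical
    # (pendulum-like) system in the variables x = (q, p), parameter a.
    q, p = x
    return (p, -a * math.sin(q))

def damped_field(x, t, a):
    # Counter-example: a friction term -0.1*p makes the divergence -0.1.
    q, p = x
    return (p, -a * math.sin(q) - 0.1 * p)

def divergence(field, x, t, a, h=1e-5):
    # Central-difference estimate of sum over nu of dX_nu/dx_nu.
    total = 0.0
    for nu in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[nu] += h
        x_minus[nu] -= h
        total += (field(x_plus, t, a)[nu] - field(x_minus, t, a)[nu]) / (2.0 * h)
    return total

point = [0.7, -0.3]
div_liouville = divergence(liouville_field, point, 0.0, 2.0)
div_damped = divergence(damped_field, point, 0.0, 2.0)
print(div_liouville, div_damped)
```

The first field satisfies the divergence condition whatever the value of the parameter a, so it is a Liouville’s system in the sense of the text; the damped one is not, and volumes in its phase-plane shrink.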

4. Volume V of a thin stratum sticking to a closed surface. Corresponding
variation of E.

Let

(4.1)  $F(x_1, x_2, \ldots, x_N) = E$

represent, for a constant value of E (in a given interval), a closed surface
$\sigma$ of $S_N$: we say “surface” instead of “hypersurface” to emphasize the
analogy with the ordinary space N = 3.

We shall suppose that the partial derivatives $\dfrac{\partial F}{\partial x_\nu}$ of F do not
vanish all together at a point of $\sigma$ (which excludes the existence of
multiple points); then the gradient

(4.2)  $G = \sqrt{\sum_{\nu=1}^{N} \left(\dfrac{\partial F}{\partial x_\nu}\right)^2}$

is always > 0, or even greater than a positive constant, while $\sigma$ admits
in every point P a well determined normal.

Let now the closed surface $\sigma$ be varied, by giving to every point P an
infinitesimal displacement $PQ = \delta n$ along the normal, $\delta n$ being
computed positively outwards (and negatively inwards).

Then a familiar formula for the increment $\delta V$ of the volume V
(enclosed initially by $\sigma$, and afterwards by the varied surface, locus
of the points Q) is

(4.3)  $\delta V = \int_\sigma d\sigma\, \delta n,$

$d\sigma$ representing an element of the (N−1)-fold extension subordinated on
$\sigma$ by the euclidean metric of $S_N$.

Instead of $\delta n$ it is convenient to put in evidence the corresponding
variation $\delta E$, which by (4.1) arises from the passage from P to Q. To
this purpose observe first that the direction cosines $\alpha_\nu$ of the
normal to $\sigma$ in a generic point P are

(4.4)  $\alpha_\nu = \dfrac{1}{G}\,\dfrac{\partial F}{\partial x_\nu}.$


In passing from P to any very near point of the surrounding space, let
$\delta x_\nu$ and $\delta F$ be the increments of the coordinates $x_\nu$ and of
their function F. From the ordinary rule of calculus

$\delta F = \sum_{\nu=1}^{N} \dfrac{\partial F}{\partial x_\nu}\,\delta x_\nu,$

while, changing beforehand if necessary the sign of F and of E in the
formula (4.1), it is always permitted to suppose that the sense of the
normal direction is outwards.

Now if the $\delta x_\nu$ are in particular the components of the infinitesimal
vector PQ, its resolved part $\delta n$ in the sense of the normal is

$\delta n = \sum_{\nu=1}^{N} \alpha_\nu\,\delta x_\nu.$

Therefore, on account of (4.1) and (4.4),

(4.5)  $\delta n = \dfrac{1}{G}\,\delta F = \dfrac{1}{G}\,\delta E,$

and (4.3) becomes

$\delta V = \int_\sigma \dfrac{\delta E}{G}\, d\sigma,$

which is the required formula.
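
As a quick check of the formula $\delta V = \int_\sigma \frac{\delta E}{G}\, d\sigma$ (an illustration added here under the simplest possible assumptions, not an example from the lecture), one may take N = 2 and let $\sigma$ be a circle, with $\delta E$ constant over it:

```latex
\begin{aligned}
F(x_1, x_2) &= x_1^2 + x_2^2 = E, \qquad V = \pi E, \\
G &= \sqrt{(2x_1)^2 + (2x_2)^2} = 2\sqrt{E} \quad \text{on } \sigma, \\
\delta V &= \int_\sigma \frac{\delta E}{G}\, d\sigma
          = \frac{\delta E}{2\sqrt{E}} \cdot 2\pi\sqrt{E}
          = \pi\, \delta E .
\end{aligned}
```

The result agrees with what is obtained by differentiating $V = \pi E$ directly.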

5. One single-valued integral F = E. Correlation between E and the a’s
by mean-values in time.

We come to the main point of our matter, taking into consideration
Liouville’s systems (1.1), which admit, for constant values of the
parameters a, one and only one single-valued integral

(5.1)  $F(x|a) = E.$
As in the preceding section, we shall suppose that the surfaces (4.1)
of $S_N$ are closed, within the range of values of E and of the a’s to be
looked at. Assuming determined values of the a’s, we address firstly
our attention to a particular motion of our system, which determines the
value of E as well as the path $\gamma$ (curve on $\sigma$); and fix furthermore
a particular instant t. The moving point will then occupy a well defined
position P. If we suppose given, just at t, to the a’s infinitesimal
increments da, and put

(5.2)  $d_a F = \sum_{i=1}^{r} \dfrac{\partial F}{\partial a_i}\, da_i,$

the constant of integration E of (5.1) assumes accordingly the
infinitesimal variation

(5.3)  $dE = d_a F,$

i.e. in the $\dfrac{\partial F}{\partial a_i}$ the x’s are simply the coordinates of P.

Suppose on the contrary that the variation of the a’s is so slow that
only an infinitesimal increase da takes place in a very long interval of
time T comprising, say at the middle, the instant t. Then the x’s are no
longer constant, but functions of t, corresponding by (1.1) to the
particular solution alluded to. With this view we are naturally (that is,
by very intuitive postulates) led to assume as the variation induced in E,
during the aforesaid time, the mean value $\overline{d_a F}$, that is

(5.4)  $d_a E = \overline{d_a F},$

where

(5.5)  $\overline{d_a F} = \dfrac{1}{T}\int_{t-T/2}^{t+T/2} d_a F\, dt.$


We have here a very reasonable expression for $d_a E$, inferred from
adiabatic variation along a particular path $\gamma$ and for a limited interval
of time T. But physically it appears even more convenient to avoid as far
as possible the peculiarities which may arise from an individual path, and
therefore to substitute for the geometrical line $\gamma$ a thin tube $\Gamma$,
made up of paths around $\gamma$. We mean precisely a tube of N−1 dimensions
arising from a small (N−2)-fold variety $\tau_0$ of $\sigma$ around the initial
position $P_0$ of the point P moving on $\sigma$. The points initially on
$\tau_0$ will occupy at a generic instant t a well determined (N−2)-fold
region $\tau$ of $\sigma$ round P. The tube $\Gamma$ is the locus of the
$\tau$’s for varying t. We shall on the other hand speak of $\tau$ as the
section of $\Gamma$ at the instant t. It will be suitable for clearness to
denote by $\Gamma_T$ the portion of the tube $\Gamma$ described in the
interval from t − T/2 to t + T/2, and also its (N−1)-dimensional extension
over the surface $\sigma$ of our euclidean $S_N$.

Thus we are led to replace in (5.4) $\overline{d_a F}$, where the bar alludes
to the time-mean, by the (N−1)-fold mean of $d_a F$ over $\Gamma_T$:

(5.6)  $d_a E = \dfrac{1}{\Gamma_T}\int_{\Gamma_T} d_a F\, d\Gamma.$


Obviously, if we were able to show that the differential relation (5.6)
tends, as $T \to \infty$, towards a limit independent of t, of $\gamma$ and of
the tube $\Gamma$, we would be entitled to regard the result as a fundamental
property of the differential system (1.1) as a whole, connecting
adiabatic infinitesimal variation of the parameters a with the variation
induced in the constant of integration E.

6. Digression on the ergodic hypothesis. Intuitive treatment.

Referring to our Liouville’s system (1.1) with the only uniform
integral F = E, the existence of a unique limit for the averaging process
just described rests essentially on the circumstance, ascertained in the
most simple cases, that every (or almost every) solution $\gamma$ will
practically fill up the whole surface $\sigma$, in the sense that it will pass
infinitely near to every point P of $\sigma$.




Indeed this is not always the case, but holds only in general, some
exceptional, not substantially troubling occurrences being duly
excluded. On the contrary the apparently more complicated (N−1)-fold
average (5.6) is more suited to the statistical character of these
questions. Under the ergodic hypothesis (to be specified in the next number)
it has been shown that the second member of (5.6) possesses always, for
$T \to \infty$, a well defined limit, independent of t and $\gamma$ as well as
of $\Gamma$.

This being granted, a further step, very acceptable from the intuitive
standpoint, consists in the calculation of a certain integral, unembarrassed
by critical reserves on the legitimacy of operational passages. The
integral alluded to has the form

$I = \dfrac{1}{\Gamma_T}\int_{\Gamma_T} f\, d\Gamma,$

where f is a generic point-function in $S_N$ (or at least on $\sigma$), and
$\Gamma_T$ the portion of the thin tube $\Gamma$ defined in the preceding
section. We may think $\Gamma_T$ divided in slices by the sections $\tau$ and
$\tau'$ which correspond to any neighbouring instants t and t + dt. If
$d\tau$ is an element of the section $\tau$, surrounding a generic point P,
and w the resolved part of the velocity (3.1) of P along the normal to
$d\tau$ (lying in $\sigma$), we have obviously

$I = \dfrac{1}{\Gamma_T}\int_{t-T/2}^{t+T/2} dt \int_\tau f\, w\, d\tau.$


The integral I may be transformed in Lebesgue’s manner as follows.
Let us fix a point P on $\sigma$ and an elementary portion $d\sigma$ around P:
for the various dt in which the thin tube $\Gamma_T$ has an infinitesimal
portion $w\, dt\, d\tau$ within $d\sigma$, the function under the integral sign
may be written $f(P)\, w(P)\, dt\, d\tau$. The total contribution afforded to
our integral by $d\sigma$ is accordingly

$f(P)\, w(P) \sum dt\, d\tau,$

$w(P) \sum dt\, d\tau$ being a sort of weighted length of time during which
our thin tube $\Gamma$ on $\sigma$ finds itself within $d\sigma$. Putting

(6.1)  $\dfrac{1}{\Gamma_T}\, w(P) \sum dt\, d\tau = K_T\, d\sigma$

and admitting (what is again perfectly plain in physical reasoning, but
rather delicate from the mathematical standpoint) that the density $K_T$
exists and is an integrable function on $\sigma$, our mean-time integral takes
the form of a space, or rather surface, integral

(6.2)  $I = \int_\sigma f(P)\, K_T\, d\sigma.$

It is to be noted that, for $T \to \infty$, the definition (6.1) of
$K_T\, d\sigma$ is, by its very nature, invariant under (1.1), in the following
sense. Let us displace P and the surrounding $d\sigma$, as well as all the
elements of the tube $\Gamma$, along the respective paths, for an amount
corresponding to an arbitrary, common interval of time t'. Then the
relation of belonging (the sum of the portions of the tube $\Gamma$ which lie
within $d\sigma$) is the same, before and after the displacement. Therefore
the left-hand side of (6.1), and $K_T\, d\sigma$ with it, preserves its value.
This just proves the invariance of $K\, d\sigma$ along any path.

In other words the density K is, for Liouville’s systems (1.1), an
integral invariant on $\sigma$.

Now, from the volume invariance, it follows that, with respect to (1.1),
$d\sigma\, dn$, and therefore by (4.5) also

$\dfrac{d\sigma\, dE}{G}$

is invariant. If dE is supposed initially constant over $\sigma$, the stratum
limited by the two surfaces F = E, F = E + dE is invariant under (1.1),
that is dE is also constant in time, and therefore $\dfrac{d\sigma}{G}$ is
invariant by itself.

It follows then from the theory of integral invariants (however, under
the further admission that K is endowed with continuous derivatives)
that KG must be on $\sigma$ an ordinary integral of (1.1). As we have supposed
that the only single-valued integral is F = E, we conclude that KG is a
mere function of F; that is (F being constant on $\sigma$) K has necessarily
the form

$K = \dfrac{\text{constant}}{G}.$
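
The conclusion that the density on $\sigma$ is proportional to 1/G can be tried on the simplest closed energy curve. The sketch below is an illustration under assumptions introduced here, not a computation from the lecture: it takes the oscillator $H = \frac{1}{2}p^2 + \frac{1}{2}\omega^2 q^2$, the test function $f = q^2$, and compares the time-mean of f along the exact motion with the mean over the ellipse H = E weighted by $d\sigma/G$. The function names and the polygonal approximation of the ellipse are hypothetical choices.

```python
import math

def time_mean(f, omega, energy, n=20000):
    # Mean of f over one period of the exact motion q = A cos(wt), p = -A w sin(wt).
    amplitude = math.sqrt(2.0 * energy) / omega
    period = 2.0 * math.pi / omega
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * period / n
        q = amplitude * math.cos(omega * t)
        p = -amplitude * omega * math.sin(omega * t)
        total += f(q, p)
    return total / n

def weighted_surface_mean(f, omega, energy, n=20000):
    # Mean of f over the curve H = E with weight d(sigma)/G, where
    # G = |grad H| = sqrt((omega^2 q)^2 + p^2); the curve is replaced by a polygon.
    amplitude = math.sqrt(2.0 * energy) / omega
    numerator = denominator = 0.0
    for i in range(n):
        phi0 = 2.0 * math.pi * i / n
        phi1 = 2.0 * math.pi * (i + 1) / n
        q0, p0 = amplitude * math.cos(phi0), -amplitude * omega * math.sin(phi0)
        q1, p1 = amplitude * math.cos(phi1), -amplitude * omega * math.sin(phi1)
        d_sigma = math.hypot(q1 - q0, p1 - p0)      # arc element (chord length)
        qm, pm = 0.5 * (q0 + q1), 0.5 * (p0 + p1)   # midpoint of the chord
        grad = math.hypot(omega ** 2 * qm, pm)
        numerator += f(qm, pm) * d_sigma / grad
        denominator += d_sigma / grad
    return numerator / denominator

def q_squared(q, p):
    return q * q

time_avg = time_mean(q_squared, omega=2.0, energy=3.0)
surf_avg = weighted_surface_mean(q_squared, omega=2.0, energy=3.0)
print(time_avg, surf_avg)
```

For a single degree of freedom the agreement is elementary, since $d\sigma/G$ is here simply proportional to dt along the path; the example only makes the role of the weight 1/G concrete.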

7.  A  glance  at  recent  mathematical  advances. 

We  shall  in  a  moment  make  essential  use  of  this  result;  but,  before 
closing  the  digression,  mention  must  be  made,  at  least  hastily,  of  the 
mathematical  work,  which  has  recently  culminated  in  the  ergodic 
theorem  of  Birkhoff. 

Poincaré had first established, for Liouville’s systems, the general
property of recurrence, showing that exceptions have zero probability
of occurrence.




Carathéodory gave, twenty years later (1914), the exact mathematical
formulation of this statement, proving that the exceptions form a set
of measure zero, a line of thought pursued thence by Birkhoff in
several directions.

We must confine ourselves to his introduction of metrical transitivity,
which replaces, for the cases considered here, the ergodic hypothesis, that
is the practical filling up of the surface $\sigma$ by a general path; and
consists in the following mathematical requirement: the surface $\sigma$ may
not be decomposed into two subsets, each of measure greater than zero, and
each invariant under (1.1).

Using a deep connection, first observed by Koopman,⁵ between
dynamics and groups of transformations in Hilbert space, professor v.
Neumann⁶ succeeded in giving a rigorous proof of the mean ergodic
theorem, from metrical transitivity as a premise. The proof was
shortly after simplified by E. Hopf,⁷ avoiding spectral decomposition,
while professor Birkhoff⁸ by different methods was able to attain even
the ergodic theorem, showing that, under the same condition, every path
$\gamma$ (excepting at most a set of measure zero) admits the same asymptotic
time-mean.

⁵ B. O. Koopman, Hamiltonian systems and Hilbert Space, Proc. of the Nat. Ac.,
17 (May, 1931), pp. 315-318.

⁶ J. v. Neumann, Proof of the quasi-ergodic hypothesis, ib., 18 (January, 1932),
pp. 70-82.

⁷ E. Hopf, On the time average theorem in dynamics, ib., pp. 93-100.

⁸ G. D. Birkhoff, Proof of a recurrence theorem for strongly transitive systems
and Proof of the ergodic theorem, ib., 17 (December, 1931), pp. 650-660. See also:
A. Wintner, Remark on the ergodic theorem of Birkhoff, ib., 18 (March, 1932),
pp. 248-251.
G. D. Birkhoff and B. O. Koopman, Recent contributions to the ergodic theorem,
ib., pp. 279-282.
G. D. Birkhoff, Probability and physical systems, Bull. of the Am. Math. Soc.,
vol. XXXVIII, 1932, pp. 361-379.

8. Gibbs-Hertz theorem.

The mean ergodic theorem enables us at once to give a very definite
and handy form to the fundamental equation (5.6) expressing the
elementary change $d_a E$ as function of the adiabatic parameters a and their
differentials da. According to section 6, we need only to replace the
somewhat complicated operation, figuring in the second member, by a
local mean extending over $\sigma$, with a density, or function of distribution,
proportional to 1/G. This gives immediately

(8.1)  $d_a E = \dfrac{\displaystyle\int_\sigma \dfrac{d_a F}{G}\, d\sigma}{\displaystyle\int_\sigma \dfrac{d\sigma}{G}},$

which may be written

(8.2)  $\int_\sigma \dfrac{d_a E - d_a F}{G}\, d\sigma = 0.$

Now $d_a E - d_a F$ has a very simple meaning, in relation to the
behaviour of the surface

$F(x|a) = E$

in the adiabatic process. The a’s being incremented by da and E by
$d_a E$, the equation becomes

(8.3)  $F = E + \delta E,$

where

(8.4)  $\delta E = d_a E - d_a F.$

We pass therefore from $\sigma$ to a nearby surface $\sigma'$, which may be
deduced point by point from $\sigma$, through a normal displacement $\delta n$.
From a known formula, recalled in section 4, this normal displacement is
defined in terms of $\delta E$ by $\delta E / G$. Hence the first member of (8.2)
is just

$\int_\sigma \delta E\, \dfrac{d\sigma}{G} = \int_\sigma \delta n\, d\sigma,$

i.e. the total variation $d_a V$, produced by the (infinitesimal) adiabatic
process on the volume V included by $\sigma$.

The relation (8.2) between $d_a E$ and the da’s is therefore equivalent to

(8.5)  $d_a V = 0,$

or V = const., along the adiabatic process, i.e.

V = adiabatic invariant.

This is the wonderful result, the elements of which Gibbs had prepared
in his statistical mechanics in the most accurate and apparently
conscious way; the explicit statement is however a distinguished merit of
Paul Hertz. The very striking feature of the deduction is that the
total differential equation between E and the a’s, deduced from a general
statistical argument, happens to fulfill automatically the condition of
complete integrability, and even more to admit just the integral
V = const.

9.  Elementary  mechanical  applications. 

The developments of the preceding numbers find capital application
to canonical systems, especially to those of dynamics, to which they
properly owe their rise.

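Indeed any canonical system

(9.1)  $\dfrac{dp_h}{dt} = -\dfrac{\partial H}{\partial q_h}, \qquad \dfrac{dq_h}{dt} = \dfrac{\partial H}{\partial p_h} \qquad (h = 1, 2, \ldots, n),$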


where H(p|q|a) depends upon the 2n arguments p, q and some
parameters a, is but a particular case of a Liouville’s system, for which
N = 2n. If H(p|q|a) does not involve t, the generalized energy integral

(9.2)  $H = E$

is admitted.

Suppose on the other hand that no other single-valued integral
exists, while the surfaces H = E in the phase-space (where p, q are
regarded as cartesian coordinates) are closed, and metrical transitivity,
i.e. ergodic behaviour, is granted; then the preceding theorem is
applicable, and we may state that the volume enclosed by isoenergic
manifolds is an adiabatic invariant.

Simple and valuable illustrations of this are offered by canonical
systems with one degree of freedom (and any number of variable
parameters). The phase-space is then the cartesian plane p, q (q abscissa
and p ordinate) and the, by hypothesis, closed surfaces H = E reduce
to the closed curves

$H(p|q|a) = E$

in the plane. In the conservative problems of ordinary mechanics H
has the particular form

(9.3)  $H = \dfrac{1}{2A}\,p^2 - U,$

where A and U (force-function) depend only on q (and on adiabatic
parameters), and every curve $\sigma$ (H = E) is symmetrical with respect
to the q-axis. If $\sigma$ has no double points the motion on it

$p = p(t), \qquad q = q(t),$

defined by the canonical system, is necessarily periodic, as is well known
from Weierstrass, and may be proved more simply by kinematical
reasoning on $\sigma$ itself, as Lampariello has recently remarked.⁹

⁹ Sulla quadratura che effettua l’integrazione dei sistemi canonici con un grado di
libertà, Rend. Acc. Lincei, Vol. XVII, 1933, pp. 74-88.




The double fact that the isoenergic manifold $\sigma$ is in this case a mere
curve and that the motion is periodic would allow us to establish the
adiabatic property of the area within $\sigma$ by elementary means, without
any ergodic or analogous theorem, because its validity is in this case
directly apparent.

10.  Examples. 

Oscillator. The corresponding H may be taken under the form

$H = \tfrac{1}{2}p^2 + \tfrac{1}{2}\omega^2 q^2,$

$\omega$ being the adiabatic parameter.

The area within the ellipse H = E is obviously

(10.1)  $V = \dfrac{2\pi E}{\omega}.$

Hence $E/\omega$ is the adiabatic invariant for the oscillator. In the first
quantized model of radiation, invented by Planck, a great number of
oscillators were supposed, each having the same value h for $E/\nu$. If not
this very value, at least the constancy of $E/\omega$ in thermodynamical
processes is justified by its adiabatic invariant character.
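
The invariance of $E/\omega$ for the oscillator lends itself to a direct numerical experiment. The following is a minimal sketch under assumptions of my own choosing, not a computation from the lecture: the frequency is raised linearly from 1 to 2 over a time long compared with the period, and the equations of motion are integrated by a leapfrog scheme; the function name `slow_ramp`, the ramp law and the step size are hypothetical.

```python
def slow_ramp(omega_start=1.0, omega_end=2.0, duration=2000.0, dt=0.01):
    # Oscillator H = p^2/2 + omega(t)^2 q^2/2 with a slowly varying omega(t).
    def omega(t):
        return omega_start + (omega_end - omega_start) * t / duration

    def energy(q, p, w):
        return 0.5 * p * p + 0.5 * w * w * q * q

    q, p, t = 1.0, 0.0, 0.0
    e_start = energy(q, p, omega(t))
    ratio_start = e_start / omega(t)

    for _ in range(int(round(duration / dt))):
        p -= 0.5 * dt * omega(t) ** 2 * q   # half kick with the current omega
        q += dt * p                         # drift
        t += dt
        p -= 0.5 * dt * omega(t) ** 2 * q   # half kick with the updated omega

    e_end = energy(q, p, omega(t))
    ratio_end = e_end / omega(t)
    return e_start, e_end, ratio_start, ratio_end

e_start, e_end, ratio_start, ratio_end = slow_ramp()
print("energy:", e_start, "->", e_end)
print("E/omega:", ratio_start, "->", ratio_end)
```

Under these assumptions the energy should come out roughly doubled while the ratio $E/\omega$ stays nearly unchanged; shortening `duration` makes the variation less slow and spoils the invariance progressively.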

Oscillating pendulum. If l denotes the length of the pendulum, g the
terrestrial gravity and $2\theta_0$ the amplitude of each oscillation, we have
for the period T of a complete oscillation the familiar formula

(10.2)  $T = 4\sqrt{\dfrac{l}{g}}\,K(k) = 2\pi\sqrt{\dfrac{l}{g}}\,\sum_{n=0}^{\infty} c_n^2\, k^{2n},$

where $k = \sin\tfrac{1}{2}\theta_0$, K denotes the complete elliptic integral of
the first kind, and the numerical coefficients $c_n$ of its development are

$c_0 = 1, \qquad c_n = \dfrac{1 \cdot 3 \cdot 5 \cdots (2n - 1)}{2 \cdot 4 \cdot 6 \cdots 2n}.$

From the definition of V (area included by an isoenergic curve in the
phase-plane) we get (omitting for brevity the formal calculation)

(10.3)  $V = 4\pi\, g^{1/2}\, l^{3/2} \sum_{n=0}^{\infty} \dfrac{c_n^2}{n+1}\, k^{2n+2}.$

The total energy E, if we dispose of the additive constant so as to
render E = 0 for the state of stable equilibrium, has the value

(10.4)  $E = 2glk^2.$




Hence, putting in evidence in (10.3) the factor $2\pi E/\omega = ET$, and
using (10.2) and (10.4), we may write

(10.5)  $V = \dfrac{2\pi E}{\omega} \cdot \dfrac{2\int_0^k K\, k\, dk}{K k^2}.$

By the mean value theorem,

$2\int_0^k K\, k\, dk = \int_0^{k^2} K\, d(k^2) = \bar{K}\, k^2,$

where $\bar{K}$ refers to some value of the argument intermediate between 0
and k. Hence

$V = \dfrac{2\pi E}{\omega}\,\dfrac{\bar{K}}{K},$

the second factor being always less than unity for k > 0, because K
increases with k from $\pi/2$ to infinity. The infinity being merely
logarithmic, the ratio $\bar{K}/K$ ranges from 1 for k = 0 to 0 for k = 1.
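
The pendulum formulae can be cross-checked numerically. The sketch below is an illustration only, with choices made here and not in the lecture: K(k) is computed by the arithmetic-geometric mean, the area V is obtained once from the power series $4\pi g^{1/2} l^{3/2} \sum c_n^2 k^{2n+2}/(n+1)$ and once from the quadrature $16\sqrt{g l^3}\int_0^k K\, k\, dk$ (the form that underlies the ratio $2\int_0^k K\,k\,dk / (K k^2)$), and the ratio $V/(ET)$, i.e. $\bar{K}/K$, is then formed. The values of g, l and k and all function names are assumptions.

```python
import math

def ellipk(k, iterations=40):
    # Complete elliptic integral of the first kind, modulus k,
    # by the arithmetic-geometric mean: K = pi / (2 * AGM(1, sqrt(1 - k^2))).
    a, b = 1.0, math.sqrt(1.0 - k * k)
    for _ in range(iterations):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)

def area_series(k, g, l, terms=200):
    # Power series for the phase-plane area V, with c_0 = 1 and
    # c_(n+1) = c_n * (2n + 1) / (2n + 2).
    total, c = 0.0, 1.0
    for n in range(terms):
        total += c * c * k ** (2 * n + 2) / (n + 1)
        c *= (2 * n + 1) / (2 * n + 2)
    return 4.0 * math.pi * math.sqrt(g) * l ** 1.5 * total

def area_quadrature(k, g, l, n=2000):
    # Simpson's rule for 16 * sqrt(g l^3) * integral_0^k K(x) x dx (n even).
    h = k / n
    s = ellipk(0.0) * 0.0 + ellipk(k) * k
    for i in range(1, n):
        x = i * h
        s += (4.0 if i % 2 else 2.0) * ellipk(x) * x
    return 16.0 * math.sqrt(g * l ** 3) * s * h / 3.0

g, l, k = 9.81, 1.0, 0.5                              # assumed sample values
v_series = area_series(k, g, l)
v_quad = area_quadrature(k, g, l)
energy = 2.0 * g * l * k * k                          # E = 2 g l k^2
period = 4.0 * math.sqrt(l / g) * ellipk(k)           # T = 4 sqrt(l/g) K(k)
ratio = v_series / (energy * period)                  # the factor Kbar / K
print(v_series, v_quad, ratio)
```

The two values of V should coincide to high accuracy, and the ratio should fall between 0 and 1, close to 1 for moderate amplitudes.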

Problem of two bodies (variable masses). With the aid of classical
integrals of linear and angular momenta, the problem may be reduced
to one degree of freedom. Denoting by r the mutual distance of the
two particles and by $\dot{r}$ their (relative) radial velocity (to be regarded
as conjugate to r), the reduced Hamiltonian has the form

(10.6)  $H = \dfrac{\dot{r}^2}{2} + \dfrac{c^2}{2r^2} - \dfrac{fM}{r},$

where M is the sum of the masses, f the gravitational constant, c the
constant angular momentum.

If we suppose that the masses may vary, the problem is still regulated
by the canonical system¹⁰

$\dfrac{dr}{dt} = \dfrac{\partial H}{\partial \dot{r}}, \qquad \dfrac{d\dot{r}}{dt} = -\dfrac{\partial H}{\partial r},$


but it is no longer soluble by quadratures. At any rate, excluding
external forces, the angular momentum c is rigorously constant, not a
quantity slowly varying as M. As is well known, the curves of the
phase-plane (r, $\dot{r}$)

$H = E$

are closed if and only if E < 0. Then, putting

$E = -\dfrac{fM}{2a},$

a represents the semi-axis, for constant M, of the elliptic (relative)
trajectory and, for variable M, of the osculating one.

¹⁰ This is the current assumption to which we here adhere. In some cases
however it is no longer true. See a note of the Author in the Rend. Acc. Lincei
(Vol. XI, 1930, pp. 626-632).

In the same sense

(10.7)  $c^2 = fMa\,(1 - e^2),$

e denoting the eccentricity.

A simple calculation of the area included by H = E leads to the
conclusion¹¹ that fMa is an adiabatic invariant: (10.7) shows then that the
eccentricity e too is an analogous invariant. These conclusions are
obviously legitimate provided that all the premises upon which our
reasoning rests (compare e.g. section 1) are satisfied. An essential
requirement is that the curves H = E should be closed, or E < 0. If
the variation of M, though slow at will, causes at length the
instantaneous ellipse to become a parabola and then a hyperbola, we
certainly are out of the limits of adiabatic invariance. My colleague
Armellini¹² has given a very interesting example of this behaviour,
supposing that M decreases from unity according to the formula


where  <  is  a  very  small  positive  constant.  In  this  case  the  equations  of 
motion  may  be  easily  integrated  and  gives  in  particular 

e  =»  «  (1  -f-  €<). 

The  eccentricity,  instead  of  remaining  adiabatically  invariant  would 
tend  to  infinity. 
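The adiabatic invariance of fMa and of e can be checked numerically. The following is only a sketch under stated assumptions: the linear mass law M(t) = 1 - eps*t, the units f = 1, the starting ellipse and the Runge-Kutta step are all illustrative choices, not taken from the lecture, and the mass law is not Armellini's.

```python
import math


def simulate(e0=0.3, eps=1.0e-3, t_end=200.0, dt=0.01, f=1.0):
    """Integrate the reduced radial motion with a slowly decreasing mass.

    Assumed (hypothetical) mass law: M(t) = 1 - eps * t.
    Returns ((M*a, e) at t = 0, (M*a, e) at t = t_end).
    """
    M0, a0 = 1.0, 1.0
    c = math.sqrt(f * M0 * a0 * (1.0 - e0 ** 2))  # relation (10.7)
    r, v = a0 * (1.0 - e0), 0.0                   # start at pericentre

    def mass(t):
        return M0 - eps * t

    def accel(r, t):
        # d(rdot)/dt = -dH/dr for H = rdot^2/2 + c^2/(2 r^2) - f M / r
        return c ** 2 / r ** 3 - f * mass(t) / r ** 2

    def elements(r, v, t):
        M = mass(t)
        E = 0.5 * v ** 2 + c ** 2 / (2.0 * r ** 2) - f * M / r
        a = -f * M / (2.0 * E)                    # osculating semi-axis
        e = math.sqrt(max(0.0, 1.0 - c ** 2 / (f * M * a)))
        return M * a, e

    start = elements(r, v, 0.0)
    t = 0.0
    for _ in range(int(round(t_end / dt))):
        # classical fourth-order Runge-Kutta step for (r, rdot)
        k1r, k1v = v, accel(r, t)
        k2r, k2v = v + 0.5 * dt * k1v, accel(r + 0.5 * dt * k1r, t + 0.5 * dt)
        k3r, k3v = v + 0.5 * dt * k2v, accel(r + 0.5 * dt * k2r, t + 0.5 * dt)
        k4r, k4v = v + dt * k3v, accel(r + dt * k3r, t + dt)
        r += dt * (k1r + 2 * k2r + 2 * k3r + k4r) / 6.0
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        t += dt
    return start, elements(r, v, t)


if __name__ == "__main__":
    (ma0, ecc0), (ma1, ecc1) = simulate()
    print("M*a:", ma0, "->", ma1)
    print("e  :", ecc0, "->", ecc1)
```

With these settings the mass falls by a fifth over a few dozen revolutions while the orbit stays elliptic, so the closed-curve premise of the argument holds throughout the run.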

Symmetrical top turning about a fixed point O, under no external forces.

** Applicazioni astronomiche degli invarianti adiabatici, Atti del Congresso
Int. dei Matematici, T. V, Bologna, 1931, pp. 17-28.

** Sopra l'incremento dell'eccentricità nel problema dei due corpi di masse
decrescenti, con applicazioni alle orbite delle stelle binarie, Rend. Acc. Lincei,
Vol. XV, 1932, pp. 702-706.




The discussion of this case by Krall** has revealed the curious feature
that, if the vector K, the resultant angular momentum, is to be treated
(section 2) as rigorously constant, then no further adiabatic invariant is
admitted. Otherwise the expression of the adiabatic invariant seems
to bear a trace of the (initial) orientation of both K and the gyroscopic
axis with respect to some fixed direction (fixed means in this case also
not exposed to adiabatic influences).

11.  Advanced  theory — Adiabatic  invariants  deduced  from  involutory 
integrals —  Burgers  theorem  as  a  corollary. 

The preceding examples all refer to problems which have properly
one degree of freedom, or may be reduced to this case by the use of
integrals of the elementary canonical form p = const.

Now take a general Hamiltonian H(p|q) with n degrees of freedom
and possibly involving adiabatic parameters a. Suppose that, for
constant a's, the corresponding canonical system

(11.1)    $\dfrac{dp_h}{dt} = -\dfrac{\partial H}{\partial q_h}, \qquad \dfrac{dq_h}{dt} = \dfrac{\partial H}{\partial p_h}$    (h = 1, 2, ..., n)

admits the energy integral

$H(p|q) = E$,

as also m (< n) and no other single-valued integrals

$F_\alpha(p|q) = c_\alpha$    ($\alpha$ = 1, 2, ..., m)

not involving t, and in involution among themselves. For the sake of
symmetry it will be convenient to introduce also the notations

$H(p|q) = F_0(p|q), \qquad E = c_0$,

so that the m + 1 known integrals all come under the above scheme, $\alpha$
ranging from 0 to m:

(11.2)    $F_\alpha = c_\alpha$    ($\alpha$ = 0, 1, 2, ..., m).

It has been shown by the author** that m + 1 adiabatic invariants
exist and may be constructed by mere quadratures, under only qualitative
conditions: closure of each surface $F_\alpha = c_\alpha$ in phase-space; metrical
transitivity duly intended, namely extending over (2n - 2m - 1) dimensions,
not over the entire manifold (11.2), which has m dimensions
more, i.e. 2n - m - 1.

** L'invariante adiabatico nel moto libero dei giroscopi, Rend. Acc. Lincei, Vol.
XIV, 1931, pp. 179-184.

** Drei Vorlesungen über adiabatische Invarianten, Abh. aus dem Math. Seminar
der Hamburgischen Universität, B. VI, 1928, pp. 323-366.

This result appears as a rather elaborate combination of the Gibbs-Hertz
theorem with Lie's theory of reduction of canonical systems by means of
known integrals, as realized in the illuminating treatment of Morera.
Perhaps the most interesting application hitherto made of this general
result regards the case of Stäckel, in which the system (11.1) (for
constant a's) is integrable by separation of variables. In this case Stäckel
himself had recognized, by a remarkable extension of a result of
Weierstrass, that all motions are multiperiodic (German: bedingt-periodisch),
and Burgers** had succeeded, resting on this circumstance in a very
ingenious way, in demonstrating the adiabatic invariance of the single
integrals

$W_h = \oint p_h\,dq_h$    (h = 1, 2, ..., n),

extended each over the individual period $T_h$ admitted by the coordinate
$q_h$ (and its momentum $p_h$). Exceptional cases however are left aside by this
method: the so-called degenerate motions, for which at least two of the
partial periods $T_h$ are commensurable, i.e. have a rational ratio. Our
treatment, which ignores the analytical features of the solutions and
needs only a metrical transitivity of a lower rank, does not meet such
a difficulty. It shows, as was to be foreseen from the very nature of things,
that the adiabatic invariance of the $W_h$ is not influenced by a mere
arithmetical character, such as the commensurability of two periods.

A large number of papers have been devoted to these degenerate
motions with the aim of ensuring a solid mechanical support to the
systematic "Atombau" of Sommerfeld, emphasized at the beginning of
this lecture.

12. Birkhoff's severity against canonical variables and methods —
Apology for a milder attitude.

Professor Birkhoff has on many occasions manifested his scepticism
about the magic virtues of the canonical form and transformations.
Indeed the most profound questions connected with a given canonical
system,

(12.1)    $\dfrac{dp_h}{dt} = -\dfrac{\partial H}{\partial q_h}, \qquad \dfrac{dq_h}{dt} = \dfrac{\partial H}{\partial p_h}$    (h = 1, 2, ..., n),


'*  Adiabatic  invariants  and  the  theory  of  quanta,  Phil.  Mag.,  vol.  XXXIII, 
1917,  pp.  500-513. 




as existence of single-valued integrals, existence of integral invariants,
stability of a given solution and the like, have invariant character with
respect to any single-valued transformation between the two n-tuples
p, q and any set of 2n independent functions of them

$x_1, x_2, \ldots, x_{2n}$,

and not only with respect to canonical transformations.

As (12.1) is the associated system** of the linear form (pfaffian)

(12.2)    $\psi_d = \sum_{h=1}^{n} p_h\,dq_h - H\,dt$,

to get its transform in general coordinates $x_1, x_2, \ldots, x_{2n}$, we need only
to determine explicitly the expression

Sn 

(12.3) 

I 

taken by $\psi_d$ in the new variables, and then to pass to the corresponding
associated system.

Canonical transformations are those for which [the variables x being
separated in two series $p_h$, $q_h$ (h = 1, 2, ..., n)] the pfaffian preserves
its canonical form

n 

•  I 

Professor Birkhoff thinks that the premeditated limitation to this
particular group of transformations "may be regarded as a mere exercise
in analytical ingenuity." In my opinion this hard sentence deserves
attenuation, and at least temporary amendment.

To win a clear insight into the matter let us recall another point of
view, which in some instances (especially in the theory of adiabatic
invariants) may be essential.

Firstly indicate by U the general group of all transformations which
behave uniformly (within the required field); by C its canonical
subgroup. Next introduce, besides U, another group D formed by the
operations belonging to a certain field of rationality, in the largest
acceptation, resulting from the profound work of Drach:** it is substantially
the group to which the formal researches of the old analysts appealed,
concerning, say, integration by a certain number of quadratures,
reduction of the order of differential systems, and the like. The group D,
unlike C, is not in general a subgroup of U.

** Compare e.g. Levi-Civita and Amaldi, Lezioni di meccanica razionale, Vol.
II (Seconda parte), Bologna, Zanichelli, 1927, p. 309.

** An account is to be found in his paper "Sur l'intégration logique des équations
différentielles ordinaires", in the Proc. of the fifth Int. Congr. of Math., Vol. I
(Cambridge: at the University Press, 1913), pp. 438-497.

Nobody may disagree with Professor Birkhoff in thinking that the
questions dominated by group U are generally far more important than
those which depend upon D. And from this point of view to restrict
oneself beforehand to the subgroup C may be an artificial and
sometimes even disconcerting policy.

But on the other hand, to account for possible reduction of differential
order in a given dynamical problem, or especially to procure (in the few
integrable cases) the complete resolution by quadratures, group D is
dominant. Now, for canonical systems, the classical theorems of
Liouville and Lie allow, by means of a certain number, say m, of known
involutory integrals, a reduction of order amounting to 2m units; while
for general systems and general integrals, the analogous reduction is
only of m units. Such a fundamental property is obviously invariant
with respect to the general group U; but, if we start from a system
associated to a generic pfaffian, not beforehand reduced to the canonical
form, we are not able (at least till now, though the matter might be
further investigated):

1) To recognize whether m known integrals are or are not in
involution;

2) To accomplish the effective reduction of 2m units in the most
favourable case.

Perhaps this would require, like the previous reduction of (12.3) to
the canonical form, operations of a high analytical rank, sometimes
even higher than the integration of the differential system itself; at any
rate not barely in finite terms, as are needed, when the system and
integrals are expressed in canonical variables, to discriminate the involutory
character of the m given integrals and to accomplish in this case the
practical order-reduction of 2m units.

13.  Application  of  the  same  argument  to  adiabatic  invariants. 

Similar circumstances occur in the here outlined theory of adiabatic
invariants. This concept is indeed invariant under the general group
U and therefore the abstract theory might be set on foot with reference
to a general differential system, instead of devoting prevalent attention
to Liouville's or canonical type. Here again there is a similar
justification, namely that in general considerable difficulties hinder the
explicit construction of adiabatic invariants.




As a simple and instructive illustration of this occurrence let us
consider a Liouville's system with adiabatic parameters a:

(13.1)    $\dfrac{dx_\nu}{dt} = X_\nu(x|t|a)$    ($\nu$ = 1, 2, ..., N),

with

(13.2)    $\sum_{\nu=1}^{N} \dfrac{\partial X_\nu}{\partial x_\nu} = 0$,

and the single-valued integral (representing closed hypersurfaces)

(13.3)    $F(x|a) = E$.

As we know (numbers 3, 8), the euclidean extension V of the field C
interior to a hypersurface (13.3) is an adiabatic invariant. With these
data the evaluation of V requires only an (N - 1)-fold quadrature.

Suppose now that we are given the same system (13.1), referred
however to general coordinates $\xi_\nu$ ($\nu$ = 1, 2, ..., N):

(13.4)    $\dfrac{d\xi_\nu}{dt} = \Xi_\nu(\xi|t|a)$    ($\nu$ = 1, 2, ..., N).

The condition of vanishing divergence,

$\sum_{\nu=1}^{N} \dfrac{\partial \Xi_\nu}{\partial \xi_\nu} = 0$,

will in general not be preserved; and the adiabatic invariant V, referred
to the $\xi$'s, will have the expression

$V = \int_C D\, d\xi_1\, d\xi_2 \cdots d\xi_N$,

D denoting the functional determinant of the former variables x's with
respect to the $\xi$'s. The transformation being unknown, we must search
for D just appealing to the property of V of being adiabatic, and in
particular an integral invariant, which last property is equivalent for D
to be a multiplier of (13.4), that is a solution of the partial differential
equation

(13.5)    $\dfrac{\partial D}{\partial t} + \sum_{\nu=1}^{N} \dfrac{\partial (D\,\Xi_\nu)}{\partial \xi_\nu} = 0$.

You see that the construction of V requires in the general case the
preliminary determination of a solution D of (13.5), which, in operative
classification, is of a rank much above ordinary quadratures.
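A small worked example may make the multiplier condition concrete. It is not part of the lecture: the system, the change of variables and the symbols $\xi$, $\eta$, $\Xi_1$, $\Xi_2$ are illustrative assumptions, and the multiplier equation is written in its usual time-independent Jacobi form.

```latex
% Divergence-free (Liouville) system: the harmonic oscillator
\dot x = y, \qquad \dot y = -x, \qquad
\frac{\partial \dot x}{\partial x} + \frac{\partial \dot y}{\partial y} = 0 .

% Assumed non-canonical coordinates: \xi = x, \ \eta = e^{y}
\dot\xi = \Xi_1 = \ln\eta, \qquad \dot\eta = \Xi_2 = -\xi\,\eta, \qquad
\frac{\partial \Xi_1}{\partial \xi} + \frac{\partial \Xi_2}{\partial \eta} = -\xi \neq 0 .

% Functional determinant of (x, y) with respect to (\xi, \eta), and the multiplier check
D = \frac{\partial(x,y)}{\partial(\xi,\eta)} = \frac{1}{\eta}, \qquad
\frac{\partial (D\,\Xi_1)}{\partial \xi} + \frac{\partial (D\,\Xi_2)}{\partial \eta}
  = \frac{\partial}{\partial \xi}\!\left(\frac{\ln\eta}{\eta}\right)
  + \frac{\partial}{\partial \eta}\,(-\xi) = 0 .
```

Here D could be written down at once because the transformation was known. When only the transformed system is given, D has to be found as a solution of the multiplier equation, which is the difficulty the text points to.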

14. Indication of some recent contributions.

The mathematical theory of adiabatic invariants has received a
valuable improvement by Geppert.** He has put and resolved the
following question:

Given a general differential system

(14.1)    $\dfrac{d\xi_\nu}{dt} = \Xi_\nu(\xi|t|a)$    ($\nu$ = 1, 2, ..., N)

with parameters a, suppose that, for constant a's,

(14.2)    $F_\alpha(\xi|t|a) = c_\alpha$    ($\alpha$ = 1, 2, ..., m)

are its independent single-valued integrals: their number m may range
from 0, if no such integral is admitted, to N in the most favourable case.

What conditions are to be verified by a function W(c|a), in order that
it be an adiabatic invariant of (14.1)? Obviously, in this requirement
some averaging is implicit, and the existence of an asymptotic time-mean
corresponding to this averaging must be assumed.

On these lines Geppert has established a system of characteristic
partial differential equations in which the independent variables are
the a's and the c's. Every solution W (if any) of this system gives an
adiabatic invariant. The discussion of this system forms the main part
of Geppert's research. He has also introduced the notion of relative
adiabatic invariants, an illustrative example of which is afforded by
damped oscillators.

For canonical systems (and their transforms) which arise from a
variational principle, Mattioli** has proposed a quite different treatment
of adiabatic invariants, resting on the following idea. If in the
variational formula, summing up the differential system, any involved
parameters a are submitted to adiabatic changes, and correspondingly varied,
some troubling additional terms are in general introduced. But these
disappear if the adiabatic process is simply linear in time. With this
restriction it becomes possible to ascertain comprehensively from
variational equations the most significant adiabatic invariances already
known.

** Theorie der adiabatischen Invarianten allgemeiner Differentialsysteme, Math.
Ann., B. 102, 1929, pp. 194-243.

** Principi variazionali e trasformazioni adiabatiche, Ann. di Mat., T. X, 1932,
pp. 283-328.

Finally we may notice that a successful attempt has been made
to apply the general theory of adiabatic invariants to the prediction
of the ultimate planetary evolution. But this manner of attack is now
superseded, a much simpler and more far-reaching method having been
proposed soon after by Krall.**

** See my lecture at Brown University "Secular effects of tides on the motion of
a planetary system," to be published in the Transactions of the American
Mathematical Society.


A FORMAL THEOREM ON THE DERIVATIVES OF A SERIES
OF ZONAL HARMONICS

By Robert F. H. Chao¹

In potential theory, we have many potential functions which have
the following series developments:²

(1a)    $\Omega = \sum_{n=0}^{\infty} A_n \left(\dfrac{a}{r}\right)^{n+1} P_n(u)$    for r > a

(1b)    $\Omega = \sum_{n=0}^{\infty} A_n \left(\dfrac{r}{a}\right)^{n} P_n(u)$    for r < a

where $A_n$ and a are constants,
u = cos θ,
r and θ are polar coordinates,
$P_n(u)$ are zonal harmonics.

In this paper, first, a formal theorem on the first partial derivatives
with respect to x and y of (1a) and (1b) will be given; secondly, the
question of convergence of the resulting series will be considered for an
important special case; and finally, the formulas for the magnetic field
of a circular current will be given as an example.


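Theorem Ia: Let

(1a)    $\Omega = \sum_{n=0}^{\infty} A_n \left(\dfrac{a}{r}\right)^{n+1} P_n(u)$,

then

(2a)    $H_x = -\dfrac{\partial \Omega}{\partial x} = \dfrac{1}{a} \sum_{n=1}^{\infty} n A_{n-1} \left(\dfrac{a}{r}\right)^{n+1} P_n(u)$,

(3a)    $H_y = -\dfrac{\partial \Omega}{\partial y} = \dfrac{y}{ar} \sum_{n=1}^{\infty} A_{n-1} \left(\dfrac{a}{r}\right)^{n+1} P_n'(u)$.

Theorem Ib: Let

(1b)    $\Omega = \sum_{n=0}^{\infty} A_n \left(\dfrac{r}{a}\right)^{n} P_n(u)$,

then

(2b)    $H_x = -\dfrac{\partial \Omega}{\partial x} = -\dfrac{1}{a} \sum_{n=1}^{\infty} n A_n \left(\dfrac{r}{a}\right)^{n-1} P_{n-1}(u)$,

(3b)    $H_y = -\dfrac{\partial \Omega}{\partial y} = \dfrac{y}{ar} \sum_{n=1}^{\infty} A_n \left(\dfrac{r}{a}\right)^{n-1} P_{n-1}'(u)$.

¹ Department of Mathematics, National Tsing Hua University, Peiping, China.

² See numerous examples in "Fourier's Series and Spherical Harmonics" by
W. E. Byerly, Chapter V, Art. 78, 79, 80, and 94. Note that in all these examples
the powers of a/r or r/a are either all odd or all even.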

To prove these theorems we shall use the following properties of zonal
harmonics:³

(a) $(1 - u^2) P_n'(u) + n u P_n(u) = n P_{n-1}(u)$

(b) $(n + 1) P_{n+1}(u) - (2n + 1) u P_n(u) + n P_{n-1}(u) = 0$

(c) $P_{n+1}'(u) = u P_n'(u) + (n + 1) P_n(u)$

(d) $u P_n'(u) - P_{n-1}'(u) = n P_n(u)$.
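The four properties can be spot-checked numerically. The sketch below is an illustration only: the function names are hypothetical, the polynomials are built from the three-term recurrence (b), the derivative is solved from (a), and (c) and (d) are then tested as residuals that should vanish up to rounding.

```python
def legendre_pair(n, u):
    """Return (P_n(u), P_{n-1}(u)), built up with the recurrence (b)."""
    p_prev, p = 0.0, 1.0  # P_{-1} is taken as 0, P_0 = 1
    for k in range(n):
        # (k + 1) P_{k+1} = (2k + 1) u P_k - k P_{k-1}
        p_prev, p = p, ((2 * k + 1) * u * p - k * p_prev) / (k + 1)
    return p, p_prev


def legendre(n, u):
    """P_n(u)."""
    return legendre_pair(n, u)[0]


def legendre_deriv(n, u):
    """P_n'(u) for |u| < 1, solved from property (a)."""
    p, p_prev = legendre_pair(n, u)
    return n * (p_prev - u * p) / (1.0 - u * u)


def residuals(n, u):
    """Left side minus right side of properties (c) and (d), for n >= 1."""
    res_c = legendre_deriv(n + 1, u) - (
        u * legendre_deriv(n, u) + (n + 1) * legendre(n, u)
    )
    res_d = u * legendre_deriv(n, u) - legendre_deriv(n - 1, u) - n * legendre(n, u)
    return res_c, res_d


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, residuals(n, 0.37))
```

Because the derivative is obtained from (a), the points u = 1 and u = -1 are excluded here; a closed form would be needed at the end points.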


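Proof of Theorem Ia: Since

$u = \cos\theta = \dfrac{x}{r}, \qquad r^2 = x^2 + y^2$,

we have

$\dfrac{\partial r}{\partial x} = u, \qquad \dfrac{\partial u}{\partial x} = \dfrac{1 - u^2}{r}, \qquad \dfrac{\partial r}{\partial y} = \dfrac{y}{r}, \qquad \dfrac{\partial u}{\partial y} = -\dfrac{yu}{r^2}$.

Hence

$\dfrac{\partial \Omega}{\partial x} = \dfrac{1}{a} \sum_{n=0}^{\infty} A_n \left(\dfrac{a}{r}\right)^{n+2} \{(1 - u^2) P_n'(u) - (n + 1) u P_n(u)\}$.

By (a), $(1 - u^2) P_n'(u) = n P_{n-1}(u) - n u P_n(u)$, so that

$\dfrac{\partial \Omega}{\partial x} = \dfrac{1}{a} \sum_{n=0}^{\infty} A_n \left(\dfrac{a}{r}\right)^{n+2} \{n P_{n-1}(u) - (2n + 1) u P_n(u)\}$.

By (b),

$\dfrac{\partial \Omega}{\partial x} = -\dfrac{1}{a} \sum_{n=0}^{\infty} (n + 1) A_n \left(\dfrac{a}{r}\right)^{n+2} P_{n+1}(u)$,

which gives (2a) on writing n for n + 1. In the same way

$\dfrac{\partial \Omega}{\partial y} = -\dfrac{y}{ar} \sum_{n=0}^{\infty} A_n \left(\dfrac{a}{r}\right)^{n+2} \{(n + 1) P_n(u) + u P_n'(u)\}$.

By (c),

$\dfrac{\partial \Omega}{\partial y} = -\dfrac{y}{ar} \sum_{n=0}^{\infty} A_n \left(\dfrac{a}{r}\right)^{n+2} P_{n+1}'(u)$,

which gives (3a) on writing n for n + 1.

Proof of Theorem Ib:

$\dfrac{\partial \Omega}{\partial x} = \dfrac{1}{a} \sum_{n=1}^{\infty} A_n \left(\dfrac{r}{a}\right)^{n-1} \{(1 - u^2) P_n'(u) + n u P_n(u)\}$.

By (a),

$\dfrac{\partial \Omega}{\partial x} = \dfrac{1}{a} \sum_{n=1}^{\infty} n A_n \left(\dfrac{r}{a}\right)^{n-1} P_{n-1}(u)$,

which is (2b). Further,

$\dfrac{\partial \Omega}{\partial y} = \dfrac{y}{ar} \sum_{n=1}^{\infty} A_n \left(\dfrac{r}{a}\right)^{n-1} \{n P_n(u) - u P_n'(u)\}$.

By (d),

$\dfrac{\partial \Omega}{\partial y} = -\dfrac{y}{ar} \sum_{n=1}^{\infty} A_n \left(\dfrac{r}{a}\right)^{n-1} P_{n-1}'(u)$,

which is (3b).

³ See "Foundations of Potential Theory" by O. D. Kellogg, pp. 126-128.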
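The term-by-term differentiation behind the x-derivatives can be checked by finite differences. This sketch assumes that x is measured along the polar axis, so that u = x/r with r = sqrt(x^2 + y^2); the function names and the sample points are hypothetical. It tests the two single-term identities: the x-derivative of (r/a)^n P_n(u) is (n/a)(r/a)^(n-1) P_(n-1)(u), and that of (a/r)^(n+1) P_n(u) is -((n+1)/a)(a/r)^(n+2) P_(n+1)(u).

```python
import math


def legendre(n, u):
    """P_n(u) from the three-term recurrence; P_n = 0 for n < 0."""
    if n < 0:
        return 0.0
    p_prev, p = 0.0, 1.0
    for k in range(n):
        p_prev, p = p, ((2 * k + 1) * u * p - k * p_prev) / (k + 1)
    return p


def interior_term(n, x, y, a=1.0):
    """(r/a)^n P_n(u), the general term of a series of type (1b)."""
    r = math.hypot(x, y)
    return (r / a) ** n * legendre(n, x / r)


def exterior_term(n, x, y, a=1.0):
    """(a/r)^(n+1) P_n(u), the general term of a series of type (1a)."""
    r = math.hypot(x, y)
    return (a / r) ** (n + 1) * legendre(n, x / r)


def d_dx(term, n, x, y, h=1.0e-5):
    """Central-difference approximation to the x-derivative of a term."""
    return (term(n, x + h, y) - term(n, x - h, y)) / (2.0 * h)


def expected_interior_dx(n, x, y, a=1.0):
    r = math.hypot(x, y)
    return n / a * (r / a) ** (n - 1) * legendre(n - 1, x / r)


def expected_exterior_dx(n, x, y, a=1.0):
    r = math.hypot(x, y)
    return -(n + 1) / a * (a / r) ** (n + 2) * legendre(n + 1, x / r)


if __name__ == "__main__":
    for n in range(5):
        print(n,
              d_dx(interior_term, n, 0.3, 0.4), expected_interior_dx(n, 0.3, 0.4),
              d_dx(exterior_term, n, 1.2, 1.6), expected_exterior_dx(n, 1.2, 1.6))
```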


Question of Convergence: We shall consider only the important special
case when the powers of a/r or r/a occurring in (1a) or (1b) are either
all odd or all even. We shall assume that

$\lim_{n \to \infty} \left| \dfrac{A_{n+2}}{A_n} \right| \le 1$.

We shall use the following properties of zonal harmonics:³

(e) Maximum $|P_n(u)|$ for real u in (-1, 1) = $P_n(1)$ = 1

(f) Maximum $|P_n'(u)|$ for real u in (-1, 1) = $|P_n'(1)|$ = n(n + 1)/2

(g) $P_{n+2}'(u) = (2n + 3) P_{n+1}(u) + P_n'(u)$.

To show the convergence of (2a), consider the series

(4)    $\sum_{n=1}^{\infty} n\, |A_{n-1}| \left(\dfrac{a}{r}\right)^{n+1}$,

where the powers of a/r are all odd or all even according as the powers
of a/r in (2a) are all odd or all even.

Let $U_n$ be the nth term of the series, then

$\dfrac{|U_{n+2}|}{|U_n|} = \dfrac{n + 2}{n}\, \dfrac{|A_{n+1}|}{|A_{n-1}|} \left(\dfrac{a}{r}\right)^{2}$,

which approaches a limit < 1 as n becomes infinite if r > a. Hence
the series (4) converges by the ratio test.

But by (e), $|P_n(u)| \le 1$; therefore by the Weierstrass M-test, using (4)
as the test series, the series (2a) converges uniformly for -1 ≤ u ≤ 1
and r ≥ a + ε, where ε is an arbitrarily small positive number.

Similarly, the series (2b) converges uniformly for -1 ≤ u ≤ 1 and
r ≤ a - ε.

To show the convergence of (3a), consider the series

(5)    $\sum_{n=1}^{\infty} |A_{n-1}|\, |P_n'(1)| \left(\dfrac{a}{r}\right)^{n+1}$,

for which

$\dfrac{|U_{n+2}|}{|U_n|} = \dfrac{|A_{n+1}|}{|A_{n-1}|}\, \dfrac{|P_{n+2}'(1)|}{|P_n'(1)|} \left(\dfrac{a}{r}\right)^{2}$.

By (g),

$\dfrac{P_{n+2}'(u)}{P_n'(u)} = 1 + (2n + 3)\, \dfrac{P_{n+1}(u)}{P_n'(u)}$,

$\left| \dfrac{P_{n+2}'(u)}{P_n'(u)} \right| \le 1 + (2n + 3) \left| \dfrac{P_{n+1}(u)}{P_n'(u)} \right|$.

For u = 1, we have, using (e) and (f),

$\dfrac{|P_{n+2}'(1)|}{|P_n'(1)|} \le 1 + (2n + 3)\, \dfrac{2}{n(n + 1)}$.

As n becomes infinite, we have

$\lim_{n \to \infty} \dfrac{|P_{n+2}'(1)|}{|P_n'(1)|} = 1$.

Hence

$\lim_{n \to \infty} \dfrac{|U_{n+2}|}{|U_n|} < 1$    for r > a.

Hence the series (5) converges for r > a.

But by (f), $|P_n'(u)| \le |P_n'(1)|$; therefore by the Weierstrass M-test,
using (5) as the test series, (3a) converges uniformly for -1 ≤ u ≤ 1
and r ≥ a + ε.

Similarly, (3b) converges uniformly for -1 ≤ u ≤ 1 and r ≤ a - ε.
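Example: The magnetic potential due to a circular wire of radius a
carrying a current of i electromagnetic units is given by⁴

(1a)    $\Omega = 2\pi i \sum_{n=1}^{\infty} (-1)^{n+1}\, \dfrac{1 \cdot 3 \cdots (2n - 1)}{2 \cdot 4 \cdots 2n} \left(\dfrac{a}{r}\right)^{2n} P_{2n-1}(u)$    for r > a,

(1b)    $\Omega = 2\pi i \left\{ 1 - \dfrac{r}{a} P_1(u) + \sum_{n=1}^{\infty} (-1)^{n+1}\, \dfrac{1 \cdot 3 \cdots (2n - 1)}{2 \cdot 4 \cdots 2n} \left(\dfrac{r}{a}\right)^{2n+1} P_{2n+1}(u) \right\}$    for r < a.

Applying the theorems, we get:⁵

(2a)    $H_x = \dfrac{2\pi i}{a} \left\{ \left(\dfrac{a}{r}\right)^{3} P_2(u) + \sum_{n=2}^{\infty} (-1)^{n+1}\, \dfrac{3 \cdot 5 \cdots (2n - 1)}{2 \cdot 4 \cdots (2n - 2)} \left(\dfrac{a}{r}\right)^{2n+1} P_{2n}(u) \right\}$    (r > a)

(3a)    $H_y = \dfrac{2\pi i\, y}{ar} \sum_{n=1}^{\infty} (-1)^{n+1}\, \dfrac{1 \cdot 3 \cdots (2n - 1)}{2 \cdot 4 \cdots 2n} \left(\dfrac{a}{r}\right)^{2n+1} P_{2n}'(u)$    (r > a)

(2b)    $H_x = \dfrac{2\pi i}{a} \left\{ 1 + \sum_{n=1}^{\infty} (-1)^{n}\, \dfrac{3 \cdot 5 \cdots (2n + 1)}{2 \cdot 4 \cdots 2n} \left(\dfrac{r}{a}\right)^{2n} P_{2n}(u) \right\}$    (r < a)

(3b)    $H_y = \dfrac{2\pi i\, y}{ar} \sum_{n=1}^{\infty} (-1)^{n+1}\, \dfrac{1 \cdot 3 \cdots (2n - 1)}{2 \cdot 4 \cdots 2n} \left(\dfrac{r}{a}\right)^{2n} P_{2n}'(u)$    (r < a)

Furthermore, we know that the series (2a) and (3a) converge
uniformly for -1 ≤ u ≤ 1, r ≥ a + ε and the series (2b) and (3b) converge
uniformly for -1 ≤ u ≤ 1, r ≤ a - ε.

⁴ See "Electricity and Magnetism" by J. H. Jeans, 5th edition, p. 432.

⁵ See "The Magnetic Field of a Circular Cylindrical Coil," by H. B. Dwight,
Philosophical Magazine, vol. xi, April, 1931, p. 951.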


ON THE RATIONAL SOLUTIONS OF THE MATRIX
EQUATION P(X) = A

By M. H. Ingraham¹

Roth,² Franklin,³ Rutherford⁴ and others have studied the matrix
equation

P(X) = A,    I

where P(X) is a polynomial with coefficients in a field F, and A is an
n × n matrix with coefficients in F. Roth found the solutions, X, that
are polynomials in A. Franklin simplified the method of Roth and found
all the roots in the complex field or any field that contains all the roots of
a certain finite set of algebraic equations with coefficients in F.

The purpose of this note is to develop a method of finding the most
general matrix solution with elements in F and, in particular, to find a
finite process for constructing this solution when F is such that the
solutions in F of any ordinary n-th degree equation may be found by a
finite process. All points of interest are encountered in considering
the cases in which F is the rational field, and in which F is the complex
number system.

Throughout the remainder of this paper the elements of all matrices,
and the coefficients of all polynomials involved are considered to lie in a
field F. Two n × n matrices, A and B, are similar if the rational or the
classical canonical forms of A and B are identical. The classical canonical
form of A is a matrix zero except for principal minor matrices of
order 1 or of order r of the form

$\begin{pmatrix} a_i & 0 & 0 & \cdots & 0 & 0 \\ 1 & a_i & 0 & \cdots & 0 & 0 \\ 0 & 1 & a_i & \cdots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 & a_i \end{pmatrix}$

where the $a_i$ are not necessarily in F. It is clearly seen that the classical
canonical form is completely determined from the determination of the
rank of $(A - \lambda I)^n$ for all positive integers n and all λ's in a field
containing the roots of g(x), the characteristic function of A. It can be
readily shown that this is equivalent to the determination of the rank of
$[f(A)]^n$ for all positive integers n and all polynomials f with coefficients
in F which are irreducible in F.

¹ Professor of Mathematics, University of Wisconsin, Madison, Wisconsin.
Ed.

² Transactions American Mathematical Society, Vol. 30 (1928), pp. 579-596.

³ Journal of Mathematics and Physics, Massachusetts Institute of Technology,
Vol. 10 (1932), pp. 289-314.

⁴ Proceedings Edinburgh Mathematical Society, II, Vol. 3 (1932), pp. 135-143.

Consider the equation P(X) = A when A is an n × n matrix and P a
polynomial with coefficients in F. If for a matrix Y, P(Y) is similar to
A, then there exists a non-singular matrix T such that $A = T P(Y) T^{-1} =
P(T Y T^{-1})$. Hence, if we can find a system of dissimilar matrices $Y_1,
Y_2, \ldots, Y_k$, such that $P(Y_i)$ is similar to A for every i and such that
every matrix Y for which P(Y) is similar to A is similar to some $Y_i$, we
can find a system of dissimilar matrices $X_1, X_2, \ldots, X_k$, such that $P(X_i)
= A$, and such that every matrix X such that P(X) = A is similar to
some $X_i$. In fact, the complete system of solutions will be of the form
$S X_i S^{-1}$, where S is commutative with A.

We now develop a method for finding k, and $Y_1, Y_2, \ldots, Y_k$.

Suppose the invariant factors of the characteristic matrix of an n × n
matrix Y to be

$g_1(x) = (f_1(x))^{m_{11}} (f_2(x))^{m_{21}} \cdots (f_s(x))^{m_{s1}}$

$\cdots$

$g_t(x) = (f_1(x))^{m_{1t}} (f_2(x))^{m_{2t}} \cdots (f_s(x))^{m_{st}}$

where the distinct polynomials $f_i$ are irreducible in F and $m_{ij} \ge m_{i,j+1}$
(i, j ≥ 1). Let the degree of $f_i$ be $e_i$, then $\sum_i e_i \sum_j m_{ij} = n$, the degree of the
characteristic function of Y. Consider any polynomial Q with coefficients
in F. Let

$Q(x) = (f_1(x))^{n_1} (f_2(x))^{n_2} \cdots (f_s(x))^{n_s}\, Q_1(x)$,

where $Q_1$ is relatively prime to $g_1$, and let $\sigma = \sum e_i m_{ij}$ (sum as to i, j
for all $m_{ij} < n_i$) and $\tau = \sum n_i e_i p_i$, where $p_i$ is the number of $m_{ij} \ge n_i$.
Then the rank of Q(Y) will be n - σ - τ. This is clear from a consideration
of the classical canonical form, for if in $F_1$, the extension of F which
contains the roots of $g_1$, $f_i(\lambda) = (\lambda - \lambda_{i1})(\lambda - \lambda_{i2}) \cdots (\lambda - \lambda_{i e_i})$,
then since the $f_i$ are irreducible and hence the $\lambda_{ij}$ distinct, the elementary
divisors of the characteristic matrix of Y are $(\lambda - \lambda_{ij})^{m_{ik}}$ (i, k, j ≤ $e_i$),
and the above theorem follows immediately from the same obvious
theorem for elementary divisors in $F_1$.
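The rank rule can be tried on a small matrix. The sketch below is an illustration under assumptions: the matrix Y, the helper names and the reading of the rule (each irreducible factor of degree e_i contributes e_i times the smaller of its exponent in an invariant factor and its exponent in Q to the nullity) are choices made here, and exact rational arithmetic is used so that ranks are not disturbed by rounding.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_at(coeffs, y):
    """Evaluate a polynomial (coefficients, highest power first) at the matrix y."""
    n = len(y)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = matmul(result, y)      # Horner step: result * y + c * I
        for i in range(n):
            result[i][i] += c
    return result


def rank(matrix):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [x - factor * z for x, z in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def predicted_rank(n, degrees, exponents, q_exponents):
    """n - sigma - tau.

    degrees[i]     : degree of the i-th irreducible factor
    exponents[i]   : its exponents in the successive invariant factors
    q_exponents[i] : its exponent in Q
    """
    sigma = tau = 0
    for e, ms, ni in zip(degrees, exponents, q_exponents):
        sigma += sum(e * m for m in ms if m < ni)
        tau += ni * e * sum(1 for m in ms if m >= ni)
    return n - sigma - tau


# Hypothetical test matrix with invariant factors x(x^2 + 1) and x.
EXAMPLE_Y = [[0, 1, 0, 0],
             [0, 0, 1, 0],
             [0, -1, 0, 0],
             [0, 0, 0, 0]]
DEGREES = [1, 2]              # degrees of x and of x^2 + 1
EXPONENTS = [[1, 1], [1, 0]]  # exponents of each factor in the two invariant factors

if __name__ == "__main__":
    for name, coeffs, q_exp in [("x^2", [1, 0, 0], [2, 0]),
                                ("x^2 + 1", [1, 0, 1], [0, 1]),
                                ("x^3 + x", [1, 0, 1, 0], [1, 1])]:
        print(name, rank(poly_at(coeffs, EXAMPLE_Y)),
              predicted_rank(4, DEGREES, EXPONENTS, q_exp))
```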

Let the invariant factors of (A - λI) be

$b_i(\lambda) = \prod_j (c_j(\lambda))^{n_{ji}}$

where, as above, the c's are irreducible in F and distinct and $n_{ji} \ge
n_{j,i+1}$, and $c_j$ is of degree $r_j$; then if P(Y) is to be similar to A, the $f_i$ and
the $m_{ij}$ must be so chosen that Q(P(Y)) is of the same rank as Q(A)
for all Q.

From the above it is seen that it is sufficient if we limit Q to be a
power of an irreducible polynomial $c_j$ whose exponents are not greater than
$n_{j1}$. If $Q(\lambda) = (c_j(\lambda))^m$, then the rank $r_{jm}$ of Q(A) is $n - \sigma_1 - \tau_1$, where
$\sigma_1 = \sum_k n_{jk} r_j$ summed for all k for which $n_{jk} < m$ and $\tau_1 = p_j m r_j$,
where $p_j$ is the number of $n_{jk} \ge m$.

Let $c_j(P(\lambda)) = \prod_k (q_{jk}(\lambda))^{u_{jk}}$ where the $q_{jk}$ are irreducible and distinct,
then

$(c_j(P(\lambda)))^m = \prod_k (q_{jk}(\lambda))^{m u_{jk}}$.

If the rank of $(c_j(P(Y)))^m$ is to be $r_{jm}$, then the $m_{ij}$ must be so chosen that
if using the above notation we let $f_i = q_{jk}$ and $n_i = m u_{jk}$, then

$\sigma + \tau = \sigma_1 + \tau_1$    (j, m).    II

It being remembered that σ, $\sigma_1$, τ, $\tau_1$ are dependent on j and m, this
yields us a system of diophantine equations, each solution of which
yields us a $Y_i$ and hence a solution $X_i$ of the original equation, and such
that the $X_i$ are dissimilar and every solution is similar to some $X_i$.
The number of solutions of the system II is k, the number of dissimilar
solutions of I.

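Example: Let F be the field of rational numbers.

Consider $X^2 + 2X = A$, (1), where

$A = \begin{pmatrix} -1 & 0 & 0 & 2 \\ 2 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ -2 & 0 & 0 & -1 \end{pmatrix}$

$|A - \lambda I| = \lambda^2 (\lambda^2 + 2\lambda + 5)$

$b_1(\lambda) = \lambda(\lambda^2 + 2\lambda + 5), \qquad b_2(\lambda) = \lambda$

$c_1(\lambda) = \lambda, \qquad c_2(\lambda) = \lambda^2 + 2\lambda + 5$

$r_1 = 1, \qquad r_2 = 2$

$n_{11} = n_{12} = n_{21} = 1, \qquad n_{22} = 0$

$c_1(P(\lambda)) = \lambda^2 + 2\lambda = \lambda(\lambda + 2)$

$c_2(P(\lambda)) = \lambda^4 + 4\lambda^3 + 6\lambda^2 + 4\lambda + 5 = (\lambda^2 + 1)(\lambda^2 + 4\lambda + 5)$

Let

$g_i(\lambda) = \lambda^{m_{11i}} (\lambda + 2)^{m_{12i}} (\lambda^2 + 1)^{m_{21i}} (\lambda^2 + 4\lambda + 5)^{m_{22i}}$;

for j = 1, m = 1 equation II becomes

σ + τ = 2;

for j = 1, m = 2 equation II becomes

σ + τ = 2,

the three possible solutions being

$m_{111} = m_{112} = 1$;    $m_{121} = m_{122} = 1$;    $m_{111} = m_{121} = 1$.

Similarly, for j = 2, m = 1 we have

σ + τ = 2

and for j = 2, m = 2 we have

σ + τ = 2.

Hence

$m_{211} = 1$ or $m_{221} = 1$

and the remaining $m_{jki}$ are all 0. Hence the invariant factors of the
characteristic matrices of any set of six dissimilar solutions are

$\lambda(\lambda^2 + 1),\ \lambda$;    $(\lambda + 2)(\lambda^2 + 1),\ \lambda + 2$;    $\lambda(\lambda^2 + 4\lambda + 5),\ \lambda$;

$(\lambda + 2)(\lambda^2 + 4\lambda + 5),\ \lambda + 2$;    $\lambda(\lambda + 2)(\lambda^2 + 1)$;    $\lambda(\lambda + 2)(\lambda^2 + 4\lambda + 5)$.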
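To illustrate the complete work the first of these solutions will be found.
Let

$Y_1 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$,    so that    $Y_1^2 + 2Y_1 = \begin{pmatrix} 0 & 2 & 1 & 0 \\ 0 & -1 & 2 & 0 \\ 0 & -2 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$.

If

$T = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}$,

then

$T (Y_1^2 + 2Y_1) T^{-1} = A$

and then

$X_1 = T Y_1 T^{-1} = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix}$

is a solution. The matrices $X = S X_1 S^{-1}$, where S is commutative
with A, are the solutions of (1) similar to $X_1$.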
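The worked solution can be verified directly with integer matrix arithmetic. The sketch assumes that the entries of A, Y1 and T in the printed example are to be read row by row as 4 × 4 matrices; the helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def p_of(x):
    """P(X) = X^2 + 2X, the polynomial of equation (1)."""
    square = matmul(x, x)
    n = len(x)
    return [[square[i][j] + 2 * x[i][j] for j in range(n)] for i in range(n)]


A = [[-1, 0, 0, 2],
     [2, 0, 0, 1],
     [0, 0, 0, 0],
     [-2, 0, 0, -1]]

Y1 = [[0, 1, 0, 0],
      [0, 0, 1, 0],
      [0, -1, 0, 0],
      [0, 0, 0, 0]]

# T exchanges the 1st and 2nd, and the 3rd and 4th, coordinates,
# so it is its own inverse.
T = [[0, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 0]]
T_INV = T

X1 = matmul(matmul(T, Y1), T_INV)

if __name__ == "__main__":
    print("X1 =", X1)
    print("X1^2 + 2 X1 == A:", p_of(X1) == A)
```

The same check applies to any other candidate: a matrix X solves (1) exactly when `p_of(X) == A`.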




ON CAUSALITY, STATISTICS AND PROBABILITY

By Eberhard Hopf

Introduction. Everybody is familiar with certain frequency phenomena
which are connected with the repeated turning of a roulette wheel,
the tossing of a coin or a die and with Buffon's needle experiment. What
is the theoretical explanation of these experimental phenomena? The
solution of this question should, naturally, throw much light on the true
origin of the fundamental laws of probability. Poincaré¹ was the first
to make valuable contributions in this direction (shuffling of cards,
roulette). His remarks have given rise to a new branch of the theory of
probability, the method of arbitrary functions. Suppose that a
roulette wheel is spun a large number of times. The frequency with
which the wheel comes to lie within a small sector dφ will be represented
by a certain function f(φ). The large number of alternately red and
black sectors then makes the frequencies of red and black approximately
equal, independent of f(φ). A more detailed study of the
roulette shows, moreover, that after sufficiently rapid spinning even the
individual sectors appear with nearly equal relative frequency, small
deviations in the initial velocities causing large differences in the final
positions.
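The equalizing effect of many alternating sectors can be illustrated with a short computation. This is a sketch only: the two densities, the sector counts and the function names are arbitrary choices made here, and the frequency of red is obtained by midpoint-rule integration of the density over the red sectors rather than by simulating spins.

```python
import math


def red_frequency(density, sectors, samples_per_sector=200):
    """Fraction of spins ending on red when the rest angle has the given density.

    density : non-negative function on [0, 2*pi), not necessarily normalised
    sectors : even number of equal sectors, coloured red, black, red, ...
    """
    width = 2.0 * math.pi / sectors
    step = width / samples_per_sector
    red = total = 0.0
    for k in range(sectors):
        # midpoint rule on sector k
        mass = sum(density(k * width + (j + 0.5) * step)
                   for j in range(samples_per_sector)) * step
        total += mass
        if k % 2 == 0:
            red += mass
    return red / total


def ramp(phi):
    """A strongly lopsided density, proportional to the angle itself."""
    return phi


def bump(phi):
    """A smooth density concentrated near phi = 1."""
    return math.exp(-(phi - 1.0) ** 2)


if __name__ == "__main__":
    for n in (4, 36, 360):
        print(n, red_frequency(ramp, n), red_frequency(bump, n))
```

For the ramp density the deviation from one half works out to 1/(2n) for n sectors, so it shrinks as the sectors become finer whatever the overall shape of the density.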

The method of arbitrary functions has led to many beautiful results
(Hadamard, Hostinsky, v. Mises).² Most of these results concern,
however, the so-called Markoff chains, i.e. iterations of a process in
which the transition from phase to phase is not strictly causal but
regulated by a law of probability. Important and natural as these
investigations are for many applications, they cannot give insight into
the true origin of the laws of probability. On the contrary, they are
based upon them.

In 1918, the Polish physicist M. v. Smoluchowski³ pointed out, with
greater emphasis and more detailed reasoning than Poincaré, how
accurately the concept of chance may be defined and how naturally
the fundamental laws of probability may be derived, once frequency
phenomena are recognized as produced by strictly causal mechanisms.
Let us drop a coin from the height of one meter above the floor. Its
final position on the floor will be a definite function of the initial phase
(position and velocity). Slight changes of the initial phase will cause a
completely different effect (head up instead of tail up). Corresponding
to the two possible events, the phase space will be divided into two parts
$H$, $T$ which contribute about equally in measure to all (not too small)
regions. If the coin is dropped a large number of times and if we
describe the different initial phases by a continuous distribution function,
the relative frequencies of $H$ and $T$ turn out to be nearly equal,
independent of that function. This supposes only that the distribution
function is not too irregular. The use of such a function is, of course,
based upon a fiction as, in reality, the set of initial phases is always
countable. From the way $H$ and $T$ are distributed it is, however,
plausible that the relative frequencies will be approximately equal
within 'most' sequences (the true law of large numbers).

¹ Calcul des Probabilités, Paris 1912, Introduction.

² See, for instance, R. v. Mises, Wahrscheinlichkeitsrechnung, Leipzig 1931,
§16.

³ Die Naturwissenschaften, 1918, Planck-Festschrift.
See also a recent article by D. J. Struik, Philosophy of Science I (1934), p. 50.

A rigorous mathematical theory based upon such approximate
notions would very much lack simplicity, just as would a geometry
that should deal with material instead of geometrical curves. Since
strictly homogeneous point sets do not exist, apart from trivial
exceptions, one or several parameters must occur in the problem. When
these parameters tend toward certain limits, the regions corresponding
to the different events will gradually become distributed in a
homogeneous way. More precisely, the relative measure of the part of $A$
(the region corresponding to the event $A$) in common with an
arbitrarily given region $B$ will approach a number

$$L(A)$$

independent of $B$. $L(A)$ may be called the relative frequency of the
event $A$ with respect to the mechanism considered. $L$ will,
furthermore, be independent of the measure used in deriving $L$. A strict
frequency phenomenon can thus only be regarded as a limit case which
is represented in nature with a smaller or larger degree of accuracy.
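A small computation may make the limit statement concrete. The sketch below is an illustration under assumptions of our own, not the author's mechanism: the phase space is taken to be $[0, 1)$ with ordinary length as measure, the event region $A_n$ consists of every other one of $2n$ equal stripes, and $n$ plays the part of the parameter. The function names are hypothetical.

```python
def stripe_measure(n, a, b):
    """Length of A_n ∩ [a, b), where A_n is the union of the intervals
    [2k/(2n), (2k+1)/(2n)), k = 0, ..., n-1 (every other stripe)."""
    total = 0.0
    for k in range(n):
        lo, hi = 2 * k / (2 * n), (2 * k + 1) / (2 * n)
        total += max(0.0, min(hi, b) - max(lo, a))   # length of the overlap
    return total

def relative_measure(n, a, b):
    """Relative measure of the part of A_n in common with B = [a, b)."""
    return stripe_measure(n, a, b) / (b - a)

regions = [(0.0, 1.0), (0.1, 0.35), (0.62, 0.64)]    # arbitrary regions B
for n in (1, 10, 100, 10_000):
    print(n, [round(relative_measure(n, a, b), 4) for a, b in regions])
```

For small $n$ the quotient depends on the region $B$; as $n$ grows it approaches one half for each of the three regions, which is the role played by $L(A)$ in the text.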

Parameters are, however, not a mere mathematical requirement.
In conservative mechanisms, distribution effects become more
pronounced after a sufficiently long time. The roulette wheel must be
spun rapidly, the coin must be released with sufficient energy and the
dice in the box must be shaken vigorously. It is, however, more
convenient for our purposes, to consider parameters which are
independent of the initial phase, such as the coefficient of friction $\mu$
and the modulus of elasticity $e$. $\mu$ and $1 - e$ must be small if the
energy range be fixed (the mechanism must be nearly conservative). These
frequency phenomena can then all be brought under the above common
scheme. There is a certain space of initial operations and a family
of sets in this space. The problem is to find out if these sets tend to
distribute themselves in a homogeneous way when the parameter tends
to its limit.

Although  the  frequency  phenomena  in  question  are  produced  by 
dissipative  mechanisms,  the  greater  part  of  this  paper  is  devoted  to 
conservative  mechanisms.  The  reader  will,  however,  recognise  that 
the  independence  theorem,  the  frequency  theorem  and  its  generalisation 
have  nothing  to  do  with  this  restriction.  A  further  reason  for  such  an 
extensive  limitation  to  conservative  mechanisms  is  that  the  paper  also 
deals  with  general  distribution  phenomena,  with  the  ergodic  theory  and 
with  the  theory  of  mixture. 

A  few  introductory  sections  on  the  general  theory  of  measure  cannot 
be  avoided.  The  reader  may,  however,  start  from  section  5  and  use 
the  preceding  ones  for  reference.  I  should  not  forget  to  mention  my 
indebtedness  to  my  colleagues,  D.  J.  Struik  and  N.  Wiener,  without 
whose  initial  stimulus  this  paper  would  not  have  been  written. 


Table of contents

1. Fields of point sets and measures
2. Integrals. An approximation theorem
3. Integration in product spaces
4. The product space of denumerably many spaces
5. Conservative mechanisms
6. Long run statistics
7. Frequency- and distribution-phenomena
8. A mixing mechanism
9. The independence theorem
10. The coin problem and the die box
11. The ergodic theory
12. Mixture and statistic regularity
13. Frequency in sequences
14. Dissipative mechanisms. Roulette and Buffon's needle

1. Fields of Point Sets and Measures.⁴ We consider a totality $\Omega$ of
points $P$. Sets of points of $\Omega$ are denoted by $A$, $B$, .... A system of
such sets is called a field $\mathfrak{F}$ (Körper) of point sets if, with two sets,
$A \subset \mathfrak{F}$, $B \subset \mathfrak{F}$, the sum $A + B$, the intersection
(Durchschnitt) $AB$ and the difference $A - B$ also belong to $\mathfrak{F}$. We
suppose that $\Omega$ itself belongs to the field $\mathfrak{F}$. Let us consider an
additive point set function $m(A)$, which is defined for all sets
$A \subset \mathfrak{F}$ and which satisfies the conditions

a. Additivity. $m(A + B) = m(A) + m(B)$, if $AB = 0$.

b. $m(A) \ge 0$.

c. $0 < m(\Omega) < \infty$.

Such a function $m(A)$ may also be called a finite measure on $\Omega$.

⁴ In sections 1, 4 we follow closely p. 13 to 16 and 24 to 30 of A. Kolmogoroff,
Grundbegriffe der Wahrscheinlichkeitsrechnung. Berlin, Springer, 1933.
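The field postulates and the conditions a), b), c) can be verified mechanically on a finite example. The sketch below is an assumption of ours for illustration only: a six-point space, a field generated by a hypothetical partition `BLOCKS`, and a measure given by block weights. None of these names come from the paper.

```python
from itertools import combinations

OMEGA = frozenset(range(6))
# Hypothetical generating partition of OMEGA and the weight of each block.
BLOCKS = {frozenset({0, 1}): 0.5, frozenset({2, 3, 4}): 0.3, frozenset({5}): 0.2}

def generated_field(blocks):
    """All unions of blocks (including the empty union)."""
    blocks = list(blocks)
    field = set()
    for r in range(len(blocks) + 1):
        for combo in combinations(blocks, r):
            field.add(frozenset().union(*combo))
    return field

def m(a):
    """Measure of a set of the field: total weight of the blocks it contains."""
    return sum(w for block, w in BLOCKS.items() if block <= a)

FIELD = generated_field(BLOCKS)

def is_field(field):
    """Closure under sum, intersection and difference; OMEGA belongs to it."""
    return OMEGA in field and all(
        (a | b) in field and (a & b) in field and (a - b) in field
        for a in field for b in field
    )

def is_finite_measure(field):
    """Conditions a), b), c) of the text, checked over the whole field."""
    additive = all(
        abs(m(a | b) - (m(a) + m(b))) < 1e-12
        for a in field for b in field if not (a & b)
    )
    return additive and all(m(a) >= 0 for a in field) and 0 < m(OMEGA) < float("inf")

print(len(FIELD), is_field(FIELD), is_finite_measure(FIELD))
```

In a finite example of this kind every field is already a Borel-field, so the distinction drawn next in the text only matters for infinite systems of sets.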

A field $\mathfrak{F}$ is called a Borel-field of point sets if, in addition to the
ordinary field postulates, the sum of denumerably many elements of $\mathfrak{F}$
is again an element of $\mathfrak{F}$. Other names are: absolutely additive system
of point sets, $\sigma$-field (Hahn).

It is well known that there always exists a smallest Borel-field $B\mathfrak{F}$
containing a given ordinary field $\mathfrak{F}$. $B\mathfrak{F}$ is called the Borel-extension
of $\mathfrak{F}$. The most important example is the case where $\mathfrak{F}$ consists of all
finite sums of "intervals" in an $n$-dimensional space,⁵ while $B\mathfrak{F}$ consists
of all Borel sets.

A finite measure $m(A)$ defined on a Borel-field is called absolutely
additive if, in addition to a), the additivity holds also for denumerably
many mutually exclusive sets,

a'. Absolute additivity. $m\left(\sum_n A_n\right) = \sum_n m(A_n)$.

It is an important question whether a finite measure $m(A)$ defined on
an ordinary field $\mathfrak{F}$ can be extended onto $B\mathfrak{F}$ in such a way that it
becomes absolutely additive on $B\mathfrak{F}$. An obvious necessary condition
is that $m(A)$ be absolutely additive on $\mathfrak{F}$ itself,

a*. The absolute additivity a') is satisfied whenever all $A_n$ and their
sum $\sum_n A_n$ belong to $\mathfrak{F}$.

This postulate is well known to be equivalent to the "continuity" of
$m(A)$: for any descending sequence of point sets $A_n$ of $\mathfrak{F}$ without a
common point, the measures $m(A_n)$ tend to zero. The postulate a*),
together with b) and c), is also sufficient to guarantee the possibility of
extending $m(A)$ onto $B\mathfrak{F}$ in an absolutely additive way.

Extension Theorem. A finite measure $m(A)$ on $\mathfrak{F}$ that is absolutely
additive within $\mathfrak{F}$ can always be extended onto $B\mathfrak{F}$ in such a way that
$m$ is nowhere negative and absolutely additive on the whole of $B\mathfrak{F}$.
The extension is unique.

⁵ Sums of semi-closed (closed on the left) intervals when $\Omega$ is a segment of a
straight line, sums of semi-closed parallelepipeds in the euclidean space, or in a
part of it.

Proof. Point sets that belong to $\mathfrak{F}$ shall be denoted by $A$ whereas
elements of $B\mathfrak{F}$ are called $B$. For any point set $B$, we denote by $m^*(B)$
the greatest lower bound of all sums

$$\sum_n m(A_n)$$

formed in all possible ways with finitely or denumerably many elements
$A_n$ of $\mathfrak{F}$ such that $B \subset \sum_n A_n$. $m^*(B)$ is an exterior measure in the
sense of Caratheodory.⁶ When $A$ belongs to $\mathfrak{F}$, $m^*(A)$ is seen to coincide
with $m(A)$, for

$$A \subset \sum_n A_n, \quad A \subset \mathfrak{F}, \quad A_n \subset \mathfrak{F}$$

implies $\sum_n A A_n = A$, $A A_n \subset \mathfrak{F}$, $A \subset \mathfrak{F}$ and therefore

$$\sum_n m(A_n) \ge \sum_n m(A A_n) \ge m(A).$$

The right hand inequality follows in the usual way from the absolute
additivity within $\mathfrak{F}$. We prove that $m^*$ represents the required
extension of $m$ onto $B\mathfrak{F}$.

According to Caratheodory, a set $B$ is called measurable if the relation

$$m^*(W) = m^*(BW) + m^*(W - BW)$$

holds with an arbitrary point set $W$, and the Caratheodory measure of $B$
is $m^*(B)$. It is proved in Caratheodory's theory that the point sets being
measurable in this sense form a Borel-field. All that is to be proved is
therefore the measurability of the sets $A$ belonging to $\mathfrak{F}$. We cover an
arbitrarily given point set $W$ with a finite or denumerable set of point
sets $A_n \subset \mathfrak{F}$,

$$W \subset \sum_n A_n.$$

For a given set $A \subset \mathfrak{F}$ we have

$$AW \subset \sum_n A A_n, \qquad W - AW \subset \sum_n (A_n - A A_n).$$

It is an essential point that every term on both right hand sides belongs
to $\mathfrak{F}$. This implies

$$\sum_n m(A_n) = \sum_n m(A A_n) + \sum_n m(A_n - A A_n) \ge m^*(AW) + m^*(W - AW),$$

hence

$$m^*(W) \ge m^*(AW) + m^*(W - AW).$$

Since the opposite inequality is a fundamental property of an exterior
measure, we find that $A$ is measurable in the sense of Caratheodory.
The fact that all elements of $\mathfrak{F}$ are measurable implies the measurability
of all elements of $B\mathfrak{F}$. The uniqueness of the extension follows from
the minimal property of $B\mathfrak{F}$.

⁶ Caratheodory, Vorlesungen über reelle Funktionen, 1918, p. 237-258.
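The two definitions used in the proof, the exterior measure $m^*$ and Caratheodory's measurability test, can be tried out on a finite model. The sketch below rests on assumptions of ours: a six-point space, a field consisting of all unions of the hypothetical blocks in `BLOCKS`, and block weights as measure. In such a finite model the Borel-extension coincides with the field itself, so the example only illustrates the definitions, not the extension to genuinely new sets.

```python
from itertools import combinations

OMEGA = frozenset(range(6))
# Hypothetical field: all unions of these blocks, with block weights as measure.
BLOCKS = {frozenset({0, 1}): 0.5, frozenset({2, 3, 4}): 0.3, frozenset({5}): 0.2}

def subsets(s):
    """Every subset of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def outer(b):
    """m*(b): any cover of b by field sets must contain every block meeting b,
    so the greatest lower bound is the total weight of those blocks."""
    return sum(w for block, w in BLOCKS.items() if block & b)

def caratheodory_measurable(b):
    """Check m*(W) = m*(BW) + m*(W - BW) for every test set W."""
    return all(abs(outer(w) - outer(w & b) - outer(w - b)) < 1e-12
               for w in subsets(OMEGA))

measurable = {b for b in subsets(OMEGA) if caratheodory_measurable(b)}
print(len(measurable), frozenset({0, 1}) in measurable, frozenset({0}) in measurable)
```

A set that cuts a block in two, such as `{0}`, fails the test with `W` equal to that block, since both pieces then receive the full weight of the block.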

2. Integrals. An Approximation Theorem. We operate with a
definite field $\mathfrak{F}$ of subsets of $\Omega$, with its Borel-extension $B\mathfrak{F}$, and with a
finite and absolutely additive measure $m$ on $B\mathfrak{F}$. The Radon-integral
of a $B\mathfrak{F}$-measurable function

$$\int_\Omega f(P)\,dm$$

is then defined in the same way as the Lebesgue integral in well known
special cases. As usual, $B\mathfrak{F}$-summability of $f(P)$ means its
$B\mathfrak{F}$-measurability together with the finiteness of the integral

$$\int_\Omega |f(P)|\,dm.$$

Consider another finite and absolutely additive measure $m'(A)$ on
$B\mathfrak{F}$ such that the equations

$$m(A) = 0, \qquad m'(A) = 0$$

imply each other. According to Radon's general theory,⁷ there exists
an essentially unique weight function $F(P)$ such that

$$m'(A) = \int_A F(P)\,dm$$

holds for every $A \subset B\mathfrak{F}$. $F(P)$ is almost everywhere positive.
Furthermore, there is the substitution rule according to which

$$\int_\Omega f(P)\,dm' = \int_\Omega f(P) F(P)\,dm$$

holds for any function $f(P)$, which is summable in the sense of the
measure $m'$.

We shall need another theorem which is known in more or less different
formulations.

Approximation Theorem. Let $f(P)$ be $B\mathfrak{F}$-summable. For an
arbitrary $\varepsilon > 0$ there exists a $\mathfrak{F}$-measurable function $\varphi(P)$ with but a
finite number of values, such that

$$\int_\Omega |f - \varphi|\,dm < \varepsilon.$$

⁷ J. Radon, Wiener Sitzungsber. Math.-phys. Klasse 122 (1913), 1299.

Proof. The theorem becomes obvious when not $\mathfrak{F}$-measurability
but only $B\mathfrak{F}$-measurability is demanded of $\varphi$. We can, therefore,
restrict ourselves to the case where $f(P)$ has but a finite number of
values. Finally, this case can evidently be reduced to the case in which
$f(P)$ takes the values one and zero only. Let $B$ be the set where $f = 1$,
such that $f$ becomes the characteristic function of the point set $B$.
We first construct a covering set

$$B^* = \sum_n A_n \supset B, \qquad A_n \subset \mathfrak{F}$$

such that

(1)  $m(B^* - B) < \varepsilon/2$.

Furthermore, we choose a finite sum

$$A^* = \sum_{n=1}^{N} A_n$$

such that

(2)  $m(B^* - A^*) < \varepsilon/2$.

(1) and (2), together with the relation

$$(B - BA^*) + (A^* - BA^*) = (B + A^*) - BA^*$$

and with the inequalities

$$B - BA^* \subset B^* - A^*, \qquad A^* - BA^* \subset B^* - B$$

imply

$$m(B + A^* - BA^*) < \varepsilon.$$

On denoting by $\varphi(P)$ the characteristic function of the set $A^*$ (belonging
to $\mathfrak{F}$), we find that the left hand term of the last inequality equals

$$\int_\Omega |f(P) - \varphi(P)|\,dm,$$

which completes the proof. We add an evident

Corollary. If $|f| < M$, $\varphi$ can always be chosen to satisfy
$|\varphi| < M + 1$.

3. Integration in Product Spaces. We now consider two spaces $\Omega$
resp. $\Omega'$ of points $P$ resp. $P'$. Let $m(A)$ denote a finite and absolutely
additive measure defined on the given Borel field $\mathfrak{F}$ of all Borel sets of
$\Omega$. Like assumptions are made in case of $\Omega'$ with corresponding notation
$\mathfrak{F}'$. We may regard pairs

$$\pi = (P, P')$$

as points of the product space $\Omega \times \Omega'$. Subsets of this product space
shall be denoted by $\alpha$, $\beta$, .... The simplest of these sets are the product
sets,

$$A \times A'; \qquad A \subset \mathfrak{F}, \quad A' \subset \mathfrak{F}'.$$

The sets $\alpha \subset \Omega \times \Omega'$ which are representable by finite sums of mutually
exclusive product sets, are well known to form an ordinary field
$\mathfrak{G} = \mathfrak{F}\mathfrak{F}'$ of subsets of $\Omega \times \Omega'$. We define a finite measure $\mu(\alpha)$ on
$\mathfrak{G}$ by setting

$$\mu(\alpha) = \sum_n m(A_n)\, m'(A_n'),$$

where

$$\alpha = \sum_n A_n \times A_n', \qquad A_n \subset \mathfrak{F}, \quad A_n' \subset \mathfrak{F}'$$

is a finite representation of $\alpha$ by mutually exclusive product sets.
$\mu(\alpha)$ is well known to be independent of the particular representation
of $\alpha$.

In order to apply the extension theorem we must show that $\mu(\alpha)$
satisfies the postulate of continuity. For this purpose the fact is used
that an element $A$ of $\mathfrak{F}$ (likewise $A'$ of $\mathfrak{F}'$) of positive measure contains
closed subsets whose measure $m$ is arbitrarily close to $m(A)$. Every
product set $A \times A'$ of positive measure $\mu$ contains therefore closed
product sets with a measure $\mu$ arbitrarily near $\mu(A \times A')$. Finally,
every set $\alpha$ belonging to $\mathfrak{G}$, $\mu(\alpha) > 0$, contains closed sets belonging to
$\mathfrak{G}$ with a measure arbitrarily near $\mu(\alpha)$. Let us suppose now that

$$\alpha_1 \supset \alpha_2 \supset \alpha_3 \supset \ldots, \qquad \alpha_n \subset \mathfrak{G},$$

$$\lim_{n = \infty} \mu(\alpha_n) = \lambda > 0.$$

It is to be proved that the sets $\alpha_n$ have a common point. We choose
closed sets $\gamma_n \subset \alpha_n$ belonging to $\mathfrak{G}$ such that

(3)  $\mu(\alpha_n - \gamma_n) < \lambda\, 2^{-(n+1)}$.

On setting

$$\delta_n = \gamma_1 \gamma_2 \ldots \gamma_n$$

we infer from (3) and from

$$\alpha_n - \delta_n \subset \sum_{\nu=1}^{n} (\alpha_\nu - \gamma_\nu)$$

that

$$\mu(\alpha_n - \delta_n) < \lambda/2.$$

This implies

$$\mu(\delta_n) > \mu(\alpha_n) - \lambda/2 \ge \lambda/2 > 0.$$

A decreasing sequence of closed nonempty sets has, however, a common
point, c. q. e. d.

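The extension theorem is thus applicable and furnishes the unique
extension of $\mu$ onto $B\mathfrak{G}$. We may now speak of integrals over $\Omega \times \Omega'$,

$$\int_{\Omega \times \Omega'} f(\pi)\,d\mu = \int_\Omega \int_{\Omega'} f(P, P')\,dm\,dm'.$$

It should be mentioned that Fubini's theorem holds for such integrals.
Perfectly analogous considerations concern the product space
$\Omega = \Omega_1 \times \Omega_2 \times \ldots \times \Omega_n$ of $n$ m-spaces,

$$\pi = (P_1, P_2, \ldots, P_n).$$

The Borel fields $\mathfrak{F}_i$ of all Borel subsets of $\Omega_i$ generate a field $\mathfrak{G}$ of sets
$\alpha$ of $\Omega$. The measures $m_i$ generate similarly a measure $\mu$ on the Borel
extension of $\mathfrak{G}$. Integration is defined in an analogous way,

$$\int_\Omega f(\pi)\,d\mu = \int_{\Omega_1} \ldots \int_{\Omega_n} f(P_1, \ldots, P_n)\,dm_1 \ldots dm_n.$$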
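For finite spaces the product measure and the interchange of the order of integration reduce to finite sums, which makes them easy to check. The sketch below is an illustration under our own assumptions: two small spaces with hypothetical point weights `M1` and `M2`, and one set written in two different ways as a sum of mutually exclusive product sets.

```python
from itertools import product

# Hypothetical finite spaces with point weights (the measures m and m').
M1 = {"a": 0.2, "b": 0.5, "c": 0.3}
M2 = {0: 0.6, 1: 0.4}

def mu(alpha):
    """Product measure of a set alpha of pairs (P, P')."""
    return sum(M1[p] * M2[q] for p, q in alpha)

def rectangle(a, a_prime):
    """The product set A x A' as a set of pairs."""
    return set(product(a, a_prime))

def mu_from_representation(rep):
    """Sum of m(A_n) m'(A_n') over a list of mutually exclusive product sets."""
    return sum(sum(M1[p] for p in a) * sum(M2[q] for q in ap) for a, ap in rep)

def integral(f):
    """Integral of f over the product space with respect to mu."""
    return sum(f(p, q) * M1[p] * M2[q] for p in M1 for q in M2)

def iterated(f):
    """Integrate over P' first, then over P."""
    return sum(M1[p] * sum(f(p, q) * M2[q] for q in M2) for p in M1)

def iterated_reversed(f):
    """Integrate over P first, then over P'."""
    return sum(M2[q] * sum(f(p, q) * M1[p] for p in M1) for q in M2)

# The same set alpha = {a, b} x {0, 1}, decomposed in two ways.
rep1 = [({"a", "b"}, {0, 1})]
rep2 = [({"a"}, {0, 1}), ({"b"}, {0}), ({"b"}, {1})]
print(mu_from_representation(rep1), mu_from_representation(rep2))
```

Both decompositions give the same value, and `integral`, `iterated` and `iterated_reversed` agree for any function on the pairs, which is the finite counterpart of Fubini's theorem.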

For later purposes we need the

Lemma 1. For every $B\mathfrak{G}$-summable function $F(\pi)$, and for any
$\varepsilon > 0$, there exists a finite sum

(4)  $\Phi(\pi) = \sum f_1(P_1) f_2(P_2) \ldots f_n(P_n)$

with $\mathfrak{F}_i$-summable functions $f_i(P_i)$ such that

(5)  $\int_\Omega |F - \Phi|\,d\mu < \varepsilon$.

Corollary. In case $|F| < M$, $\Phi$ can be chosen to satisfy $|\Phi| < M + 1$.

Proof. According to the approximation theorem, a $\mathfrak{G}$-measurable
function $\Phi$ can be found that takes but a finite number of values and
satisfies (5). If these values are denoted by $x_1, x_2, \ldots, x_k$, and if
$\varphi_1, \varphi_2, \ldots, \varphi_k$ designate the characteristic functions of the point sets
($\subset \mathfrak{G}$) where $\Phi = x_1$, $\Phi = x_2$, ..., $\Phi = x_k$ respectively, we obtain

$$\Phi = \sum_{i=1}^{k} x_i \varphi_i.$$

The set where $\varphi_1 = 1$ is, now, a sum of mutually exclusive product sets

$$A_1 \times A_2 \times \ldots \times A_n, \qquad A_i \subset \mathfrak{F}_i.$$

In other words, $\varphi_1$ is the sum of the products of the corresponding
characteristic functions. Since the same holds for $\varphi_2, \ldots, \varphi_k$, $\Phi$ must be
of the indicated form. The corollary follows from the corollary of the
approximation theorem.

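It will be useful to add a remark concerning the case where the product
space is the symmetric square $\Omega^2$ of a space $\Omega$, i.e., the space of the points

$$\pi = (P, P') \equiv (P', P)$$

without regard to the order. In this case a given symmetric function

$$F(\pi) = F(P, P') = F(P', P)$$

can be approximated (in the above sense) by finite sums of the form

$$\Phi(\pi) = \sum \pm f(P) f(P'),$$

for an approximating function

$$\Phi'(P, P') = \sum \varphi(P) \psi(P')$$

furnishes a symmetric function of the same type,

$$\Phi = \tfrac{1}{2}\left[\Phi'(P, P') + \Phi'(P', P)\right]$$

$$= \tfrac{1}{2} \sum \left[\varphi(P)\psi(P') + \varphi(P')\psi(P)\right]$$

$$= \tfrac{1}{2} \sum \left[(\varphi(P) + \psi(P))(\varphi(P') + \psi(P')) - \varphi(P)\varphi(P') - \psi(P)\psi(P')\right].$$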
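The last step is a polarization identity for a single product term, and it can be spot-checked numerically. The sketch below uses two arbitrary test functions of our own choosing in place of $\varphi$ and $\psi$; the helper name `symmetrize` is hypothetical.

```python
import math

def symmetrize(phi, psi):
    """Return both sides of the identity for one product term phi(P) psi(P')."""
    def lhs(p, q):
        return 0.5 * (phi(p) * psi(q) + phi(q) * psi(p))

    def rhs(p, q):
        s = lambda x: phi(x) + psi(x)
        return 0.5 * (s(p) * s(q) - phi(p) * phi(q) - psi(p) * psi(q))

    return lhs, rhs

# Arbitrary test functions standing in for phi and psi.
lhs, rhs = symmetrize(math.sin, lambda x: x * x + 1.0)
points = [(-1.3, 0.4), (0.0, 2.5), (3.1, 3.1), (0.7, -2.2)]
for p, q in points:
    print(abs(lhs(p, q) - rhs(p, q)) < 1e-12, abs(lhs(p, q) - lhs(q, p)) < 1e-12)
```

Each term on the right hand side has the form $\pm f(P) f(P')$ with $f$ equal to $\varphi + \psi$, $\varphi$ or $\psi$, which is why the symmetrized sum is of the announced type.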

4. The Product Space of Denumerably Many Spaces. It is of
importance for our purposes to generalize the theory of measure and
integration to the case of the product of denumerably many spaces

$$\ldots, \Omega_{-2}, \Omega_{-1}, \Omega_0, \Omega_1, \Omega_2, \ldots$$

the sequence being infinite in both directions. The infinity toward
the left is a purely technical requirement, that will be got rid of
afterwards. It will, for our purpose, be perfectly sufficient to consider
the simplest case where all factors $\Omega_i$ represent the same space $\Omega$. $A$
shall denote a Borel subset of $\Omega$. We now regard the sequences

$$S: \; \ldots, P_{-2}, P_{-1}, P_0, P_1, P_2, \ldots$$

as points of the space $\Omega^\infty$. It is to be emphasized that the order is
absolutely material and that a point $S$ of $\Omega^\infty$ is not defined unless the
points $P_i$ (the coordinates) are definitely attached to the indices $i$.
Two sequences $S$, $S'$ are considered equal if, and only if, $P_i = P_i'$
holds for every $i$.

We start with a finite and absolutely additive measure $m(A)$ defined
on the Borel field of all Borel sets $A$ of $\Omega$,

(6)  $m(\Omega) = 1$.

Similarly to section 3, we first construct a field $\mathfrak{G}$ of sets of points
$S$ on $\Omega^\infty$. We consider a subset of $\Omega^\infty$ of the form

(7)  $A_{-n} \times A_{-n+1} \times \ldots \times A_{n-1} \times A_n$

with given Borel sets $A_i$ on $\Omega$, i.e. the set of all sequences $S$ that satisfy
the conditions

$$P_i \subset A_i; \qquad i = -n, \ldots, 0, \ldots, n,$$

while for $|i| > n$ the points $P_i$ are entirely arbitrary. Such a set of
sequences $S$ could also be written as an infinite product ($n = \infty$), where,
however, all factors $A_i$ sufficiently far to the left and to the right equal
$\Omega$. The measure $\mu$ of a set of the form (7) will be defined by

(8)  $\mu = m(A_{-n}) \ldots m(A_0) \ldots m(A_n)$.

This is, of course, a very special way of arriving at a measure on $\Omega^\infty$,
but we shall not, in this paper, make use of other measures.⁸
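Formula (8) is easy to evaluate in a concrete case. The sketch below assumes, for illustration only, that each coordinate space is a fair coin, $\Omega$ = {"H", "T"} with $m(\Omega) = 1$; the function name `cylinder_measure` and the dictionary encoding of a set of the form (7) are our own conventions.

```python
from math import prod

# Hypothetical one-coordinate space: Omega = {"H", "T"} with m(Omega) = 1.
M = {"H": 0.5, "T": 0.5}
OMEGA = set(M)

def m(a):
    """Measure of a subset a of Omega."""
    return sum(M[p] for p in a)

def cylinder_measure(factors):
    """mu of a set of the form (7): `factors` maps an index i to the set A_i;
    every index that is not listed carries the factor Omega."""
    return prod(m(a) for a in factors.values())

heads_at_0_and_1 = {0: {"H"}, 1: {"H"}}
padded = {-2: OMEGA, -1: OMEGA, 0: {"H"}, 1: {"H"}, 2: OMEGA}
print(cylinder_measure(heads_at_0_and_1), cylinder_measure(padded))
```

Because of (6), factors equal to $\Omega$ contribute the value one, so the same set of sequences receives the same measure however many trivial factors are written out.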

A subset $\alpha$ of $\Omega^\infty$ that can be represented as a finite sum of mutually
exclusive sets $\alpha_\nu$ of the form (7) shall be called a normal set of sequences
$S$. The measure is as usual defined by

(8')  $\mu(\alpha) = \sum_\nu \mu(\alpha_\nu)$.

⁸ Concerning general measures on $\Omega^\infty$ see Kolmogoroff l. c. III, 4. Measures
in spaces with infinitely many dimensions were first considered in important
memoirs by P. J. Daniell, Annals of Math. 19, 270-294 (1918); 20, 281-299 (1919);
21, 208-220 (1920).


The fact that all normal sets $\alpha$ form an ordinary field $\mathfrak{G}$ is inferred
in the same way as in the case of an $\Omega^n$ with finite $n$ (section 3).
Since all field operations involve but a finite number of sets of the form
(7), the numbers $n$ of (7) involved have a finite maximum $m$ and
everything behaves like in the space $\Omega^{2m+1}$. In the same way, it follows that
(8') is independent of the special decomposition of $\alpha$ into a finite number
of mutually exclusive sets (7), and that $\mu$ is additive on $\mathfrak{G}$.

It is, finally, to be proved that the measure $\mu$ is continuous within
$\mathfrak{G}$.⁹ The continuity implies again the unique extension of $\mu$ onto
$B\mathfrak{G}$. Let

$$\alpha_1 \supset \alpha_2 \supset \alpha_3 \supset \ldots, \qquad \alpha_i \subset \mathfrak{G},$$

as in section 3, be a sequence of decreasing normal sets with the
property

$$\lim_{n = \infty} \mu(\alpha_n) = \lambda > 0.$$

We prove again that the $\alpha_n$ have at least one sequence $S$ in common.
Operating as in section 3, we find a sequence

$$\delta_1 \supset \delta_2 \supset \delta_3 \supset \ldots$$

of closed point sets (i.e. closed in the but finite number of dimensions
actually involved) such that, for all $n$,

$$\delta_n \subset \alpha_n, \qquad \mu(\delta_n) \ge \lambda/2.$$

Here is, however, a slight additional difficulty since the number of $A_i$ of
(7) involved in $\alpha_n$, and therefore in $\delta_n$, may increase indefinitely. Let
us suppose that $\delta_1$ is $(2n_1 + 1)$-dimensional, i.e. that $\delta_1$ is a set of
sequences $S$ where the coordinates $P_i$ with $|i| > n_1$ are entirely arbitrary.
Likewise call $2n_2 + 1$ the number of 'dimensions' of $\delta_2$, and so forth.
Without restriction of generality, we may assume that

$$n_1 \le n_2 \le n_3 \le \ldots$$

Let us select a sequence $S_1 \subset \delta_1$, $S_2 \subset \delta_2$, ...,

$$S_1: \; \ldots, (P^{(1)}_{-n_1}, \ldots, P^{(1)}_{n_1}), \ldots$$

$$S_2: \; \ldots, (P^{(2)}_{-n_2}, \ldots, P^{(2)}_{n_2}), \ldots$$

By the diagonal procedure we can find a sequence

$$S: \; \ldots, P_{-1}, P_0, P_1, \ldots$$

such that, along a suitable sequence of values of $s$,

$$\lim_{s = \infty} P^{(s)}_k = P_k$$

for every $k$. The sequence $S$ is easily seen to belong to all sets $\delta_1$, $\delta_2$,
$\delta_3$, ..., c. q. e. d. An absolutely additive measure $\mu$ is thus uniquely
defined on $B\mathfrak{G}$,

$$\mu(\Omega^\infty) = 1,$$

once the measure (8) is attached to sets of form (7).

⁹ The proof follows essentially Kolmogoroff, l. c. III, 4.

These facts involve a theory of integration of $B\mathfrak{G}$-summable functions
$F(S)$,

$$\int_{\Omega^\infty} F(S)\,d\mu.$$

In the case, where $F(S)$ depends upon but a finite number of coordinates

$$F(P_i, P_{i+1}, \ldots, P_k),$$

the integral reduces to

$$\int_\Omega \ldots \int_\Omega F\,dm_i\,dm_{i+1} \ldots dm_k.$$

Lemma 2. For every $B\mathfrak{G}$-summable function $F(S)$, and for any
$\varepsilon > 0$ there exists a function

$$\Phi(S) = \Phi(P_{-n}, \ldots, P_n)$$

such that

$$\int_{\Omega^\infty} |F - \Phi|\,d\mu < \varepsilon.$$

Corollary. If $|F| < M$ on $\Omega^\infty$, $\Phi$ can be chosen to satisfy
$|\Phi| < M + 1$ on $\Omega^\infty$.

The proof follows from the approximation theorem as in section 3.

5. Conservative Mechanisms. The possible motions of a dynamical
system, subjected to given invariable forces, can mathematically always
be described in the following way. Let $P$ denote a phase, i.e. the totality
of the variables that describe completely the state (the coordinates that
describe the position and the momenta giving the instantaneous state
of motion). After the elapse of a time $t$ a phase $P$ will be shifted to
another definite phase,

$$P_t = T_t(P).$$

An important property of the motion is that $P \neq Q$ implies $P_t \neq Q_t$.
When the forces are sufficiently smooth functions of the coordinates and
momenta, this fact is nothing but the uniqueness theorem of the solution
of a differential system. It holds, however, in more general cases, for
instance, when a die or a coin, subjected to gravity, bounces on an
elastic floor. The time independence of the acting forces mirrors itself
in the formula

(9)  $T_t T_s = T_{t+s}$.

The operations $T_t$ form thus a linear one parameter group of one to one
transformations within the phase space $\Omega$. A perfectly elastic coin
bouncing on a perfectly elastic floor represents a conservative mechanism.
However, in the case of imperfect elasticity, for instance, with a constant
coefficient of elasticity $e < 1$, the coin comes, as $t \to \infty$, to lie on the
floor (dissipative system). All haphazard games, roulette, coin, die,
Buffon's needle, are connected with a dissipative mechanism as the body
finally comes to rest. Statistical phenomena connected with
conservative mechanisms are: the distribution of the minor planets, the mixing
of a fluid subjected to a steady flow, the distribution of the molecules
moving in a vessel with elastic reflection at the walls, and other phenomena.

It was the author's intention to apply modern mathematical tools to
statistical phenomena effected by strictly causal mechanisms. The
reason why the greater part of this work is devoted to conservative
mechanisms lies in the circumstance that, at least at present, the
necessary mathematical tools are not sufficiently developed in other
directions. In spite of this restriction, chief attention will, however,
be given to the theorems which, with an appropriate change of formulation,
are likely to subsist under more general conditions. Statistical
phenomena connected with the roulette or with the tossing of a coin can very
well be put into evidence by studying appropriate conservative
mechanisms.

An essential property of a conservative mechanism is that an
invariant volume measure exists in phase space such that the volume included
between any two manifolds of constant energy is finite. This implies,
in a well known way, the existence of an invariant measure on each
manifold of constant energy, its volume being always finite. In the
sequel, the motion is considered on a single energy surface as well as in
the whole phase space.

The mathematical definition of a conservative mechanism will be this.
The phase space $\Omega$ is a m-space (metric, separable and complete, in the
sense of Hausdorff).¹⁰ This covers all applications. A finite and
absolutely additive measure $m(A)$ is defined on the Borel sets on $\Omega$.
This measure and the ordinary Borel measure on the time axis $(t)$
define an absolutely additive measure on the product space $\Omega \times (t)$.
There is a linear one parameter group of one to one transformations
$T_t(P)$ of $\Omega$ into itself, with the properties

1. $T_t T_s = T_{t+s}$.

2. $T_t(P)$ is measurable for any fixed $t$ and leaves the measure $m$
invariant,

$$m(T_t(A)) = m(A).$$

3. For any open point set $A \subset \Omega$ and for any $\tau$ the set $T_t(A)$,
$0 \le t < \tau$, is measurable in $\Omega \times (t)$.

The invariance property of the measure can, by means of functions,
be also expressed in the form¹¹

(10)  $\int_\Omega f(T_t(P))\,dm = \int_\Omega f(P)\,dm$,

$f(P)$ being summable on $\Omega$. It must also be mentioned, that, as a
consequence of the three postulates,

$$\int_\Omega f(T_t(P))\, g(P)\,dm$$

is continuous in $t$, $f$ being summable on $\Omega$, and $g$ being bounded and
measurable.¹²

The following usual notation will be used,

$$(f, g) = \int_\Omega f g\,dm, \qquad f_t(P) = f(T_t(P)).$$
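Property 1 and the invariance relation (10) can be tested on a very simple mechanism. The sketch below is our own illustrative choice, not an example taken from the paper: a frictionless wheel whose phase is the pair (angle in turns, angular speed), moving on $[0, 1) \times [1, 2]$ with the ordinary area as measure. The names `T`, `integral` and `shifted` are hypothetical, and the integral is a midpoint approximation.

```python
import math

def T(t, phase):
    """Frictionless wheel: the angle phi (in turns, mod 1) advances at the
    constant speed omega; omega itself does not change."""
    phi, omega = phase
    return ((phi + omega * t) % 1.0, omega)

def integral(f, n_phi=200, n_omega=200):
    """Midpoint approximation of the integral of f over [0, 1) x [1, 2]
    with respect to the area measure d(phi) d(omega)."""
    total = 0.0
    for i in range(n_phi):
        for j in range(n_omega):
            total += f(((i + 0.5) / n_phi, 1.0 + (j + 0.5) / n_omega))
    return total / (n_phi * n_omega)

def f(phase):
    """An arbitrary summable test function on the phase space."""
    phi, omega = phase
    return (1.0 + math.cos(2.0 * math.pi * phi)) * omega

def shifted(f, t):
    """The function f_t(P) = f(T_t(P))."""
    return lambda phase: f(T(t, phase))

P = (0.25, 1.5)
print(T(0.7, T(1.9, P)), T(2.6, P))
print(integral(f), integral(shifted(f, 3.7)))
```

The map is a shear of the (angle, speed) plane, which preserves area; this is why the two integrals agree up to rounding.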

6. Long Run Statistics. We imagine a roulette wheel which moves
without friction, or a coin that is perfectly elastically reflected at the
floor. Let us repeatedly start the wheel or toss the coin. After the
elapse of $t$ seconds (the same time in all cases) the wheel is suddenly
brought to a standstill, or the position of the coin is observed. For
reasons mentioned above we do not care how much this arrangement
differs from the actual course of the experiment. What is the frequency
with which a definite sector appears, or with which a definite side of the
coin (head, tail) is seen from above?

¹⁰ In this connection considered by J. v. Neumann, Annals of Math. 33 (1932),
574.

¹¹ B. O. Koopman, Proc. of the Nat. Ac. of Sc., 17 (1931), 315-318.

¹² J. v. Neumann, l. c.

Generally, we consider a conservative mechanism, start repeatedly
with certain phases $P$ and observe how often, after elapse of the time $t$,
a definite event occurs, i. o. w. how often the point $T_t(P)$ comes to lie
in a definite part $A$ of the phase space $\Omega$. Suppose, in order to
push the abstraction further, that we make a continuous instead of
countable number of experiments, described by a distribution function
$f(P)$.

(11)  $\int_B f(P)\,dm = (f, \varphi_B)$

denotes then the "number of times" with which we start within the
region $B$, $\varphi_B$ being the characteristic function of $B$,

$$\varphi_B = 1, \; P \subset B; \qquad \varphi_B = 0, \; P \subset \Omega - B.$$

$f(P)$ is, naturally, supposed to be nowhere negative and summable
over $\Omega$, the "total number" $(f, 1)$ of all experiments made being finite
and positive. The "relative number" of experiments for which, after
the elapse of time $t$, the event $A$ occurs, equals evidently

(12)  $\dfrac{(f, \varphi_{A_{-t}})}{(f, 1)}$.

It is on account of the invariance (10) of the measure, that, according
to $\varphi_{A_{-t}}(P) = \varphi_A(P_t)$, this fraction equals

(12')  $\dfrac{(f_{-t}, \varphi_A)}{(f, 1)}$.

In the case of the (conservative) coin problem, $A$ being the event 'head
upward', we should expect this quotient to tend towards one half as
$t \to \infty$ independent of the way of tossing the coin, i.e. independent of the
distribution $f(P)$ of the initial phases. It will be convenient to have
an appropriate name for such a phenomenon.

Definition. An event $A$ is statistically regular with respect to a given
conservative mechanism if, for any non-negative and summable
function $f(P)$, the quotient (12) tends towards the same limit
$L(A)$ as $t \to \infty$,

(13)  $(f, \varphi_{A_{-t}}) \to L(A)\,(f, 1)$.




$L(A)$ is called the relative frequency of the event $A$. When $f$ is the characteristic function of a point set $B$ the statistical regularity of $A$ implies that

(14)    $\dfrac{m(B A_{-t})}{m(B)} = \dfrac{m(B_t A)}{m(B)} \to L(A)$

as $t \to \infty$, independent of $B$, i.o.w. the part of $B$ that comes to lie in $A$ after the elapse of time $t$ has in the long run the measure $L(A)\,m(B)$. Conversely, if (14) holds for any set $B$ of positive measure, the event $A$ is statistically regular in the above sense. This follows from the fact that, for any $\varepsilon > 0$, a finite sum $\bar f$ of characteristic functions can be found such that

$(|f - \bar f|, 1) < \varepsilon.$

Since (14) implies (13), with $\bar f$ instead of $f$, the uniformity of approximation shows that (13) holds for any summable $f$. It is even sufficient to suppose that (14) holds for all ‘elementary’ regions $B$ (parallelepipeds, spheres), since every open set is a (denumerable) sum of such sets, and since every measurable set is the intersection (Durchschnitt) of denumerably many open sets minus a set of measure zero.

Let us denote by $g(P)$ a nowhere negative and summable function that is invariant under the mechanism, i.e. that satisfies, $t$ being arbitrarily given, the relation

$g_t(P) = g(P)$

for almost all points $P$. For an invariant function $f = g$, the left hand side of (13) becomes independent of $t$, thus yielding

(15)    $L(A) = \dfrac{\int_A g\,dm}{\int_\Omega g\,dm}.$

If $g$ is the characteristic function of an invariant and measurable point set $I$, the right hand quotient becomes

$\dfrac{m(AI)}{m(I)}.$


For statistical regularity of the event $A$, it is therefore necessary (but not sufficient) that any point set which is invariant under the mechanism have the same fraction (measured with the invariant measure $m$) in common with the set $A$.




This state of affairs can be expressed in a more significant form when account is taken of all other finite and invariant measures $m'$ on $\Omega$. We call the two invariant (finite) measures $m$ and $m'$ comparable to each other, if the two equations $m(B) = 0$, $m'(B) = 0$ imply each other. $m'$ can always be expressed by an integral

$m'(B) = \int_B g(P)\,dm,$

the almost everywhere positive weight function $g$ being invariant under the mechanism.

Definition. We call the quotient

$P(A) = \dfrac{m'(A)}{m'(\Omega)}$

an a priori probability of the event $A$, if $A$ is measured by a measure $m'$ that is invariant under the mechanism considered.

In general, there will be several a priori probabilities with respect to a given mechanism. However, for the existence of the relative frequency of an event $A$ with respect to a given mechanism, it is necessary (not sufficient) that the a priori probability be independent of the special invariant measure. We have, in case $L(A)$ exists,

$L(A) = P(A).$

7. Frequency- and Distribution-Phenomena. Causal mechanisms can produce long run effects in two differently realizable ways. Experiments, repeated under the same causal conditions, produce an event with a definite frequency (frequency phenomenon) or a large number of things subjected to the same mechanism at the same time,¹³ arrange themselves, in the long run, in a definite distribution (distribution phenomenon). A characteristic example of the second kind is the mixing of a fluid subjected to a steady flow. From the point of view of the preceding section, i.e. when continuous distribution functions are used for the mathematical description, there is of course no formal distinction between the two kinds as the special arrangement (one after another, nacheinander, or side by side, nebeneinander) does not enter the formulation of the mathematical problem. Actually the use of distribution functions seems more adapted to the study of distribution phenomena. The mathematical connection between the fiction of a continuous number of experiments

¹³ Ensemble in the terminology of Gibbs.




and  actual  sequences  of  experiments  will,  however,  be  studied  in  a 
later  section. 

There is a simple and well known example that serves as an illustration for both kinds of phenomena. We consider the differential system

$\dot\varphi = \omega, \qquad \dot\omega = 0,$

$\varphi$ being an angular variable mod. $2\pi$. The integration gives

$P = (\varphi, \omega), \qquad P_t = T_t(P) = (\varphi + \omega t, \omega).$

This describes the motion of a frictionless roulette wheel, $\omega$ being the angular speed, or the motion of a minor planet, $\varphi$ being the mean longitude, and $\omega$ being the mean motion. The roulette is every time imagined to be stopped after $t$ seconds. What is the relative frequency with which the indicator of the wheel comes to lie within a given sector

$A:\ \varphi_0 < \varphi < \varphi_1 \qquad (\varphi_1 - \varphi_0 < 2\pi)$

or, what is the long run distribution in mean longitude of the minor planets? The answer is easily found when $\omega$, $\varphi$ ($\omega > 0$) are interpreted as polar coordinates in the plane. A finite and invariant measure is evidently furnished by

dm  *  t{(M))d(pdu. 

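A sectorial element

$B:\ \varphi' < \varphi < \varphi' + \Delta\varphi, \quad \omega' < \omega < \omega' + \Delta\omega$

may serve as an elementary region $B$ in (14). When $B$ is imagined to be filled with particles it is geometrically obvious that the particles will, in the long run, appear uniformly distributed in direction, i.e. that (14) holds, with¹⁴

$L(A) = \dfrac{\varphi_1 - \varphi_0}{2\pi}.$

According to (13), we generally have, $f \ge 0$ and periodic in $\varphi$,

$\lim_{t \to \infty} \dfrac{\int_0^\infty \int_{\varphi_0}^{\varphi_1} f(\varphi - \omega t, \omega)\,d\varphi\,d\omega}{\int_0^\infty \int_0^{2\pi} f(\varphi, \omega)\,d\varphi\,d\omega} = \dfrac{\varphi_1 - \varphi_0}{2\pi},$

¹⁴ This statement is a special case of a general theorem which is proved in section 11.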




provided that the denominator is finite. As this relation holds for any sector $A$, we find for any bounded and measurable function $g(\varphi)$ periodic with the period $2\pi$

$\lim_{t \to \infty} \dfrac{\int_0^\infty \int_0^{2\pi} g(\varphi)\,f(\varphi - \omega t, \omega)\,d\varphi\,d\omega}{\int_0^\infty \int_0^{2\pi} f(\varphi, \omega)\,d\varphi\,d\omega} = \dfrac{1}{2\pi}\int_0^{2\pi} g\,d\varphi.$

Every sector of the roulette occurs, for a long time interval $t$ before the stopping, with a definite frequency, $(\varphi_1 - \varphi_0)/2\pi$. The minor planets are approximately uniformly distributed in longitude (Poincaré).
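
The roulette limit can be checked numerically. The following Python sketch is only an illustration: the initial distribution (angle uniform on a narrow arc, angular speed uniform on [1, 2]), the sample size, the seed and the name `sector_frequency` are assumptions made for this example and do not come from the text.

```python
import math
import random


def sector_frequency(t, phi0, phi1, samples=100_000, seed=0):
    """Relative number of runs whose indicator lies in phi0 < phi < phi1 at time t.

    Assumed initial distribution f(P): phi uniform on [0, 0.1],
    omega uniform on [1, 2] (a hypothetical choice for illustration).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        phi = rng.uniform(0.0, 0.1)        # narrow spread of starting angles
        omega = rng.uniform(1.0, 2.0)      # spread of angular speeds
        phi_t = (phi + omega * t) % (2.0 * math.pi)   # T_t(P) = (phi + omega t, omega)
        if phi0 < phi_t < phi1:
            hits += 1
    return hits / samples
```

With these assumptions `sector_frequency(0.0, math.pi / 2, math.pi)` is 0, since every run starts outside the sector, while for large `t` the value should settle near $(\varphi_1 - \varphi_0)/2\pi = 1/4$, whatever narrow arc the runs start from.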

The mixing of a fluid. The transformations $T_t$ can be imagined to represent a steady flow in $\Omega$ of an incompressible fluid. We consider a conservative mechanism, for which every region $A \subset \Omega$ is statistically regular, so that

(16)    $\lim_{t \to \infty} \dfrac{m(B_t A)}{m(B)} = \dfrac{m(A)}{m(\Omega)}$

holds for any two regions $A$, $B$ (consider that necessarily $L(A) = m(A)/m(\Omega)$). This means that any part of $\Omega$ will, in the long run, be uniformly distributed over the whole of $\Omega$. Such a mechanism shall be called a mixing mechanism (mixing flow).

When, in (16), $A$ and $B$ are interchanged, and when account is taken of the invariance of $m$, (16) is seen to hold also for $t \to -\infty$.

For a mixing flow, the relation

$\lim_{t \to \infty} (f_t, \varphi_A) = \dfrac{(f, 1)\,(\varphi_A, 1)}{(1, 1)}$

holds, in virtue of (13), for every region $A$ and for every summable function $f(P)$. This relation holds therefore for finite sums of functions $\varphi_A$ too. For a given bounded function $g(P)$ and a given $\varepsilon > 0$, we can, according to the approximation theorem (corollary), always find such a sum $\varphi(P)$ that

$(|g - \varphi|, 1) < \varepsilon, \qquad |\varphi| < M + 1,$

$M$ denoting an upper bound of $|g|$. The set

$C:\ |g(P) - \varphi(P)| > \dfrac{\eta}{2\,(1, |f|)}$




has therefore a measure

$m(C) < \dfrac{2\varepsilon\,(1, |f|)}{\eta}.$

The inequality

$|(f_t, g) - (f_t, \varphi)| \le \dfrac{\eta}{2} + (2M + 1)\int_C |f_t|\,dm$

is readily proved. When, $\eta$ being kept fixed, $\varepsilon$ is chosen sufficiently small, the second term, $2M + 1$ times

$\int_C |f_t|\,dm = \int_{C_t} |f|\,dm,$

can be made less than $\eta/2$, independent of $t$ ($\int |f|\,dm$ is an absolutely continuous set function), thus yielding

$|(f_t, g) - (f_t, \varphi)| < \eta$

for all $t$. This uniformity of approximation shows, therefore, that, for a mixing flow, the relation

(17)    $\lim_{t \to \infty} (f_t, g) = \dfrac{(f, 1)\,(g, 1)}{(1, 1)}$

always holds if $g$ is bounded and $f$ is summable (or vice versa). The lack of symmetry in the conditions for $f$, $g$ can, in view of the applications to frequency phenomena, not be avoided, since the most general distribution function is a summable one, and since an “event function” can always be regarded as bounded.

A conservative mechanism $T_t$ is called metrically transitive,¹⁵ if any measurable subset of $\Omega$, invariant under $T_t$, has either the measure zero or the measure $m(\Omega)$, in other words, if $\Omega$ cannot be divided into two invariant sets of positive measure. A mixing mechanism must obviously be metrically transitive (but not conversely) since (16), with $B = I$, $I_t = I$, $A = I$, implies

$m(I)\,m(\Omega) = m^2(I),$

i.e.

$m(I)\,m(\Omega - I) = 0.$

¹⁵ G. D. Birkhoff and P. A. Smith, Journal de Mathématiques pures et appliquées, 7 (1928), p. 345.




8. A Mixing Mechanism. Conservative mechanisms are likely to be metrically transitive in general, even to have the mixing property. It is a curiosity of the history of mathematics that the exceptional cases have been treated first.

Let us start with a conjecture. Suppose that the conservative flow $T_t(P)$ on $\Omega$ be metrically transitive. On setting

$d\tau = f(P)\,dt$

a new time $\tau$ is introduced along the curves of motion, $f(P)$ being positive and continuous on $\Omega$. This is well known to imply again a steady flow $S_\tau(P)$ of an incompressible fluid, the invariant measure being $\int f\,dm$. $S_\tau$ is again metrically transitive, the streamlines being the same. It is, however, likely that for ‘most’ functions $f(P)$ the flow $S_\tau$ becomes a mixing flow. Only one exceptional case seems to exist, namely, the simple harmonic oscillator,

$P:\ \varphi$ (mod. $2\pi$); $\qquad P_t:\ \varphi + at$ (mod. $2\pi$), $\quad a = \text{const.}$

In all the other cases it seems plausible that a suitable $f$, i.e. a suitable difference in speed along nearby motions, will produce complete mixing.

A very simple mixing mechanism is suggested by the process that is employed by the baker in making puff pastry. This mechanism is not a flow but a repetition of a single process $T$. Notions and considerations of the preceding sections obviously apply to the iterations of a single one to one transformation that leaves a measure $m$ invariant (conservative transformation).

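Let $\Omega$: $0 \le x < 1$, $0 \le y < 1$ be the unit square in the $x$-$y$-plane. The transformation

$T_1:\ x' = Nx, \quad y' = \dfrac{1}{N}\,y,$    $N$ being a given integer $\ge 2$,

maps $\Omega$ on the rectangle

$0 \le x' < N, \qquad 0 \le y' < \dfrac{1}{N}.$

This rectangle may be cut into the $N$ rectangles

(18)    $\nu - 1 \le x' < \nu, \quad 0 \le y' < \dfrac{1}{N}; \qquad \nu = 1, 2, \ldots, N.$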



Our second transformation $T_2$ consists in shifting these $N$ rectangles parallel to themselves (turning about the angle $\pi$ is also admitted) until they fill up $\Omega$ again. The discontinuous transformation

$T = T_2 T_1$

transforms $\Omega$ into itself in a one to one manner (with the exception, perhaps, of the points of certain straight lines). $T$ obviously preserves the ordinary plane measure,

$m(A) = \iint_A dx\,dy, \qquad m(T(A)) = m(A).$

If the $N$ rectangles (18) are simply packed upon each other, in the order in which they appear in (18), we may represent $T$ by using $N$-adic expansions,¹⁶

$T(x, y) = (x'', y''),$
$x = 0.a_1 a_2 a_3 \ldots, \qquad x'' = 0.a_2 a_3 a_4 \ldots,$
$y = 0.b_1 b_2 b_3 \ldots, \qquad y'' = 0.a_1 b_1 b_2 \ldots$
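
The digit rule for $T$ can be tried out in exact arithmetic. The sketch below is a minimal illustration, assuming that the strips are stacked in the order of (18); the function names `baker` and `digits` are hypothetical and chosen only for this example.

```python
from fractions import Fraction
import math


def baker(x, y, N=2):
    """One step T = T_2 T_1 on the unit square, strips stacked in order (assumed)."""
    a1 = math.floor(N * x)      # first N-adic digit of x, i.e. index of the strip
    x_new = N * x - a1          # x'' = 0.a2 a3 a4 ...
    y_new = (a1 + y) / N        # y'' = 0.a1 b1 b2 ...
    return x_new, y_new


def digits(z, N, count):
    """First `count` N-adic digits of a number z in [0, 1)."""
    out = []
    for _ in range(count):
        z *= N
        d = math.floor(z)
        out.append(d)
        z -= d
    return out
```

For $N = 2$, $x = 5/8 = 0.101$ and $y = 1/4 = 0.01$ (binary), `baker(Fraction(5, 8), Fraction(1, 4))` gives $x'' = 1/4 = 0.01$ and $y'' = 5/8 = 0.101$: the leading digit of $x$ has moved to the front of $y$.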

In making puff pastry, the baker applies the following mixing process: a lump of butter is wrapped up into the dough; then the whole mass is rolled out and folded together. This process is repeated several times, rolling out ($T_1$) and folding together ($T_2$), thus mixing dough and butter in a particular way. Both become finally distributed in very thin layers. From this remark it is intuitively obvious that the iteration of an above transformation $T$ (which is a good idealization of the mixing process employed by the baker) mixes $\Omega$ completely. The mixture property means that

(19)    $\lim_{n \to \infty} m(A\,T^n(B)) = m(A)\,m(B)$

holds for any two measurable point sets $A$, $B$, both lying in $\Omega$. (19) is what we are going to prove.

Let us denote by

$R(m, n)$

¹⁶ The metric transitivity was proved by W. Seidel, Proc. of the Nat. Ac. of Sci., 19 (1933), 453-456.




any rectangle such that its projection upon the $x$-axis is an interval of the form

$(\mu - 1)N^{-m} \le x < \mu N^{-m}; \qquad \mu = 1, 2, \ldots, N^m$

and that its projection upon the $y$-axis is one of the intervals

$(\nu - 1)N^{-n} \le y < \nu N^{-n}; \qquad \nu = 1, 2, \ldots, N^n.$

Let us call such a rectangle a fundamental rectangle. Obviously two fundamental squares ($n = m$) have either no inner point in common, or one is contained in the other. An $R(m', n')$ consists, for $m' \le m$, $n' \le n$, of precisely $N^{m - m' + n - n'}$ rectangles $R(m, n)$. Every measurable set $A$ can be simply covered by a set of fundamental squares, the measure of which is as close to $m(A)$ as we wish. We readily infer from this fact that (19) needs only to be proved when $A$ and $B$ are fundamental squares. We show, moreover, that

(20)    $m[R(k', l')\,T^n(R(k, l))] = m[R(k', l')]\;m[R(k, l)]$

holds for any

(20')    $n \ge k + l'.$

$T$ is readily seen to transform every $R(k, l)$, $k \ge 1$, into some $R(k - 1, l + 1)$. This may be expressed (the subsequent relation has a similar meaning) by the equation

$T(R(k, l)) = R(k - 1, l + 1).$

Applying $T$ $k$ times we get

(21)    $T^k(R(k, l)) = R(0, k + l).$

We have

(22)    $\Omega = R(0, 0) = \sum_{\nu = 1}^{N^i} R_\nu(i, 0)$

where the right hand side denotes the sum of all possible rectangles $R(i, 0)$, $N^i$ in number. On setting

$R_\nu(i, k + l) = R_\nu(i, 0)\,R(0, k + l),$

we observe that the left hand side passes, for $\nu = 1, 2, \ldots, N^i$, through all possible rectangles $R(i, k + l)$ contained in $R(0, k + l)$. We thus find that




(23)    $R(0, k + l) = \sum_{\nu = 1}^{N^i} R_\nu(i, k + l),$

(23')    $R_\nu(i, k + l) \subset R_\nu(i, 0)$

for

$\nu = 1, 2, \ldots, N^i.$

In accordance with the general formula (21) we may set

(24)    $T^i(R_\nu(i, 0)) = R_\nu(0, i),$

(25)    $T^i(R_\nu(i, k + l)) = R_\nu(0, k + l + i).$

The right hand side in (24) passes through all possible rectangles $R(0, i)$. From (23') we have with regard to (24), (25),

(26)    $R_\nu(0, k + l + i) \subset R_\nu(0, i).$

Now, from (21), (23) and (25),

(27)    $T^{k + i}(R(k, l)) = \sum_{\nu = 1}^{N^i} R_\nu(0, k + l + i).$

A given $R(k', l')$ may be written in the form

(28)    $R(k', l') = R(k', 0)\,R(0, l')$

with suitable right hand factors. Suppose, now, that

$i \ge l'.$

$R(0, l')$ is then the sum of precisely $N^{i - l'}$ rectangles $R(0, i)$. From (26), (27) we infer therefore that

$R(0, l')\,T^{k + i}(R(k, l))$

is the sum of precisely $N^{i - l'}$ rectangles $R(0, k + l + i)$. Taking account of (28) and of

$R(k', 0)\,R(0, k + l + i) = \text{some } R(k', k + l + i)$

we finally see that

(29)    $R(k', l')\,T^{k + i}(R(k, l))$




consists of precisely $N^{i - l'}$ rectangles $R(k', l + k + i)$. The measure of (29) is therefore equal to

$N^{i - l'}\,m[R(k', k + l + i)] = N^{i - l'}\,N^{-k' - k - l - i} = N^{-k' - l'}\,N^{-k - l} = m(R(k', l'))\,m(R(k, l)),$

q. e. d.
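
Relation (20) can be confirmed on small cases by exact counting. The sketch below is a minimal illustration and not the author's argument: it assumes the $N$-adic digit form of $T$ with the strips stacked in order, describes a fundamental rectangle by the leading digits of $x$ and of $y$ (tuples), and uses the hypothetical names `in_rect` and `overlap_measure`.

```python
from fractions import Fraction
from itertools import product


def in_rect(a, b, rect):
    """True if the point with x-digits a and y-digits b lies in the rectangle.

    rect = (x_prefix, y_prefix), both tuples of N-adic digits.
    """
    x_prefix, y_prefix = rect
    return a[:len(x_prefix)] == x_prefix and b[:len(y_prefix)] == y_prefix


def overlap_measure(N, n, rect, rect_prime):
    """Exact measure of R' T^n(R), found by enumerating N-adic digit strings."""
    k, l = len(rect[0]), len(rect[1])
    k_p, l_p = len(rect_prime[0]), len(rect_prime[1])
    len_a = max(k_p, k - n, 0)      # x-digits that can matter
    len_b = max(l_p, n + l)         # y-digits that can matter
    hits = 0
    for a in product(range(N), repeat=len_a):
        for b in product(range(N), repeat=len_b):
            if not in_rect(a, b, rect_prime):
                continue
            pa, pb = a, b
            for _ in range(n):                  # apply T^{-1} n times
                pa, pb = (pb[0],) + pa, pb[1:]
            if in_rect(pa, pb, rect):           # T^{-n}(z) in R  <=>  z in T^n(R)
                hits += 1
    return Fraction(hits, N ** (len_a + len_b))
```

With $N = 2$, $R = R(2, 1)$ given by `((0, 1), (1,))` and $R' = R(1, 2)$ given by `((1,), (0, 1))`, the product of the measures is $1/64$, and `overlap_measure(2, n, R, Rp)` returns exactly that for $n = 4$ and $n = 6$. For $n = 3$, which violates (20'), the same call returns $1/32$.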

9. The Independence Theorem. It is a common experience that simultaneous tossing of two coins leads to the relative frequency $\tfrac{1}{4}$ for every combination of the faces, $HH$, $HT$, $TH$, $TT$. The same is true when in a sequence of tossings of one coin two successive throws are considered. When a coin and a die are thrown simultaneously, the 12 combinations $(H, 1), \ldots, (T, 6)$ are known to appear with equal relative frequency. Experience shows generally, that the simultaneous event $A \times B$ occurs with the frequency

$L(A \times B) = L(A)\,L(B),$

provided that $L(A)$ and $L(B)$ exist with respect to two mechanisms, and provided that the simultaneous performance of the two corresponding experiments is possible at all without restriction and without mutual influence. This general fact is commonly described as independence of the two events $A$, $B$. The true explanation lies in a simple as well as fundamental theorem which will be stated here for conservative mechanisms.

Independence Theorem. If an event $A$ is statistically regular with respect to a conservative mechanism, and if the event $A'$ has the same property with respect to another such mechanism, the simultaneous event $A \times A'$ is always statistically regular with respect to the resulting product mechanism, and its relative frequency equals

$L(A \times A') = L(A)\,L(A').$

Proof. $A$ represents a region in the phase space $\Omega$ of the first mechanism $T_t(P)$, with the invariant measure $m$. The statistical regularity of $A$ implies the relation (13) with a definite $L(A)$ and with an arbitrary function $f(P) \ge 0$ summable over $\Omega$. Similarly, let $A'$ be statistically regular with respect to $T'_t(P')$ in $\Omega'$, $m'$ being a corresponding invariant measure. This implies an analogous equation

(30)    $(f', \varphi_{A'_{-t}}) \to L(A')\,(f', 1)$



as $t \to \infty$, where $f'(P') \ge 0$ is an arbitrary function summable over $\Omega'$ and where, of course, $\Omega'$ stands for $\Omega$, $P'$ for $P$, $T'_t$ for $T_t$ and $dm'$ for $dm$. We have, now, to consider the two mechanisms simultaneously, i.e. the product mechanism,

$\pi = (P, P'), \qquad S_t(\pi) = (T_t(P), T'_t(P')), \qquad \pi \subset \Omega \times \Omega'.$

An invariant measure $\mu$ is given by

(31)    $d\mu = dm\,dm'.$

The characteristic function of the simultaneous event $A \times A'$ is

$\Phi(\pi) = \varphi_A(P)\,\varphi_{A'}(P'),$

which implies

$\varphi_{A_{-t}}(P)\,\varphi_{A'_{-t}}(P') = \varphi_A(T_t(P))\,\varphi_{A'}(T'_t(P')) = \Phi(S_t(\pi)) = \Phi_t(\pi).$

On setting, generally,

(32)    $\langle F, G \rangle = \int_{\Omega \times \Omega'} F(\pi)\,G(\pi)\,d\mu$

and, in particular,

(33)    $F(\pi) = f(P)\,f'(P'),$

we find by multiplying (13) with (30),

$\langle F, \Phi_t \rangle \to L(A)\,L(A')\,\langle F, 1 \rangle,$

$t \to \infty$. This holds also for sums of functions (33) and therefore, according to lemma 1, for any function $F(\pi)$ summable over $\Omega \times \Omega'$, i.e. $A \times A'$ is statistically regular with respect to $S_t(\pi)$.

The same reasoning proves the generalization to $n$ events,

$L(A_1 \times A_2 \times \ldots \times A_n) = L(A_1)\,L(A_2) \ldots L(A_n),$

under the conditions formulated. The theorem could also easily be extended to much more general mechanisms.

It  follows  from  the  independence  theorem  that  the  product  of  two 
mixing  mechanisms  represents  again  a  mixing  mechanism.  The  product 
is,  in  particular,  metrically  transitive.  The  metric  transitivity  of  a 
flow  does,  as  mentioned,  not  imply  the  mixing  property.  It  will, 




however, be shown later that the metric transitivity of the double flow (product with itself) ‘nearly’ implies mixture.

Application to Buffon's Needle Experiment. The conditions may be idealized and simplified in the following way. The needle moves without friction in a plane on which infinitely many equidistant lines are drawn. Every time, the needle is stopped after elapse of time $t$. If $d$ is the distance between the lines and if $x$ denotes the coordinate perpendicular to the lines of the center of gravity of the needle (or any plane figure), $\varphi$ being the angle with the lines, the equations of motion are

$\ddot x = 0$ (mod. $d$); $\qquad \ddot\varphi = 0$ (mod. $2\pi$),


i.o.w. this is the product of two roulette models (section 7). The relative frequency with which the needle falls into the region

$x' < x < x' + \Delta x, \qquad \varphi' < \varphi < \varphi' + \Delta\varphi,$

exists therefore and equals

$L_x\,L_\varphi = \dfrac{\Delta x}{d} \cdot \dfrac{\Delta\varphi}{2\pi}.$

The reader will find this problem resumed in section 15.
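
The product formula for the idealized needle can be checked by simulation. The following Python sketch is illustrative only: the initial distributions of position, angle and the two speeds, as well as the name `needle_frequency`, are assumptions introduced for the example.

```python
import math
import random


def needle_frequency(t, d, x_lo, dx, phi_lo, dphi, samples=100_000, seed=1):
    """Relative number of runs ending in x_lo < x < x_lo + dx, phi_lo < phi < phi_lo + dphi.

    Assumed initial data: x uniform on [0, 0.05 d], phi uniform on [0, 0.1],
    translational speed uniform on [1, 2], angular speed uniform on [1, 3].
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        x = (rng.uniform(0.0, 0.05 * d) + rng.uniform(1.0, 2.0) * t) % d
        phi = (rng.uniform(0.0, 0.1) + rng.uniform(1.0, 3.0) * t) % (2.0 * math.pi)
        if x_lo < x < x_lo + dx and phi_lo < phi < phi_lo + dphi:
            hits += 1
    return hits / samples
```

For $d = 1$, $\Delta x = 0.5$ and $\Delta\varphi = \pi$ the product $(\Delta x/d)(\Delta\varphi/2\pi)$ equals $1/4$, and for large `t` the simulated value should come close to it.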

10. The Coin Problem and the Die Box. The motion of a rigid body under given conservative forces, together with elastic reflection at a given surface, can presumably be reduced to a geodesics problem on a manifold of dimensions $\le 6$ with elastic reflection of the moving point at some parts of the boundary.¹⁷ In order to put the problems concerning us into evidence we consider the simplest case of the motion of a two-dimensional rigid body in a plane. The coin, for instance, is to be replaced by a needle, and the die by a square, both being elastically reflected at a straight line (Fig. 1). We restrict ourselves to the case where the body is subjected to no force or to the gravity (acting in the direction of $-x$). Since in these cases the $y$-component of the center of gravity always moves with uniform speed, we may, without loss of generality, assume $C$ to stay on the $x$-axis.

¹⁷ When the momentary forces acting on the surface are replaced by a strong, but finite field of conservative forces, the problem is well known to be equivalent to a geodesics problem. When, afterwards, the field concentrates again on the surface, some parts of the manifold will fold themselves together.




$R$ is supposed to be a fixed point of the body. $\varphi$ denotes the angle which the moving ray $CR$ makes with the direction $+x$. The distance $OC$, taken when the body touches the $y$-axis, is a definite function $f(\varphi)$ (support function, Stützfunktion) that characterises the shape of the convex body. The distance $OT$ is easily found to be $f'(\varphi)$.

The equations of motion are (without collision)

(34)    $\dot\varphi = \omega, \quad \dot x = u, \quad \dot\omega = 0, \quad \dot u = 0$ (no force) or $\dot u = -g$ (gravity).

When the body collides with the $y$-axis at $T$, the moment about $T$ of the body¹⁸

(35)    $I\omega + M f'(\varphi)\,u$

remains unchanged after the collision (this is true, whether the collision is elastic or not). The normal component of the velocity of $T$ (regarded as a point of the body),

(36)    $u - \omega f'(\varphi),$

however, changes sign in the case of elastic collision (gets multiplied by $-e$ in the case of imperfect elasticity). The position of the body obviously obeys the inequality,

$x \ge f(\varphi),$

the equality sign characterising the collision.

We consider two cases, first: Shaking a die in a box. Idealizing as

¹⁸ $I$ is the moment of inertia about $C$.




far as possible, we give the box the shape of an infinite parallel strip of width $W$. Furthermore, we keep the box at rest and let the die (square) move without external forces between the parallel walls. We have then the additional inequality

$x \le W - f(\varphi + \pi),$

the equality sign being characteristic for the collision with the other line. In the latter case, $\varphi$ must be replaced by $\varphi + \pi$ in (35) and (36). When we now set

$q_1 = \varphi, \quad q_2 = \sqrt{M/I}\;x, \quad p_1 = \omega, \quad p_2 = \sqrt{M/I}\;u,$

the equations of motion (34) become

(37)    $\dot q_1 = p_1, \quad \dot q_2 = p_2, \quad \dot p_1 = 0, \quad \dot p_2 = 0$ (no force) or $\dot p_2 = -g$ (gravity),

while the conditions of the collision mean that

$p_1 + \sqrt{M/I}\;f'(q_1)\,p_2$

remains unchanged, while

$p_2 - \sqrt{M/I}\;f'(q_1)\,p_1$

becomes multiplied by $-e$ ($-1$ in case of perfect elasticity). This applies to collision with the lower wall

$q_2 = \sqrt{M/I}\;f(q_1),$

while at the upper wall $q_1$ is to be replaced by $q_1 + \pi$ in all conditions for the collision. The geometrical interpretation is that the point




$(q_1, q_2)$ moves in the $q_1$-$q_2$-plane between the periodic curves ($q_1$ has the period $2\pi$)

(38)    $\sqrt{M/I}\;f(q_1) \le q_2 \le \sqrt{M/I}\;\bigl(W - f(q_1 + \pi)\bigr)$

with reflection at these curves, the modulus of elasticity $e$ being the same. In particular, perfectly elastic reflection ($e = 1$) of the body corresponds to perfectly elastic reflection of the point at those curves. In the present box problem we neglect external forces ($g = 0$) so that the point $(q_1, q_2)$ moves uniformly and rectilinearly with perfect reflection at the boundary of (38). The case of a homogeneous square is illustrated in Fig. 2.

$\sqrt{3}\,h(q_1) < q_2 < C - \sqrt{3}\,h(q_1),$

$h(q_1) = \max(|\cos q_1|, |\sin q_1|).$

$C$ depends upon the ‘width’ of the ‘box’ relative to the dimensions of the ‘die’. Since $q_1$ is only defined mod. $2\pi$ the surface between the above curves is developable on a part of a cylinder. Each boundary curve consists of four congruent pieces which can be interpreted as the intersections with certain planes. Collision with one of these pieces corresponds to the collision of a definite corner of the square. The whole problem reduces thus to the geodesics problem on that part of the cylinder, with perfect reflexion at the boundaries.
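
The collision rule in the $q_1$-$q_2$-plane, tangential component kept and normal component multiplied by $-e$, can be written out as a small routine. This is a sketch under assumptions: `slope` stands for the slope $dq_2/dq_1$ of the boundary curve at the point of impact, and the function names `h` and `reflect` are hypothetical.

```python
import math


def h(q1):
    """Boundary profile of the homogeneous square: max(|cos q1|, |sin q1|)."""
    return max(abs(math.cos(q1)), abs(math.sin(q1)))


def reflect(p1, p2, slope, e=1.0):
    """Velocity (p1, p2) after hitting a curve whose tangent has slope dq2/dq1.

    The tangential component is kept; the normal one is multiplied by -e.
    """
    norm = math.hypot(1.0, slope)
    tx, ty = 1.0 / norm, slope / norm       # unit tangent of the curve
    nx, ny = -slope / norm, 1.0 / norm      # unit normal of the curve
    p_tan = p1 * tx + p2 * ty
    p_nor = -e * (p1 * nx + p2 * ny)
    return p_tan * tx + p_nor * nx, p_tan * ty + p_nor * ny
```

With `e=1.0` the speed $\sqrt{p_1^2 + p_2^2}$ is unchanged by `reflect`, which is the perfectly elastic case; with `e < 1` only the tangential combination $p_1 + \text{slope} \cdot p_2$ survives unchanged.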

The purpose of shaking a die in a box is to let ‘congruent’ phases

$\left(q_1 + \nu\,\dfrac{\pi}{2},\ q_2\right), \qquad \nu = 1, 2, 3$

appear with the same frequency. This would imply equal frequency of the four sides when the square (die) is dropped and brought to rest. In a reflexion problem without external forces the geometric path of motion




is the same for all velocities $V = \sqrt{p_1^2 + p_2^2}$. With regard to the stationarity theorem (section 11), the above conjecture (in our special case) would be equivalent to the

Problem. It is to be shown that in the above reflexion problem ($\Omega = (q_1, q_2, \theta)$, $\theta$ being the direction of the path) any measurable phase function $F(q_1, q_2, \theta)$ that is invariant under the motion, is also invariant under the ‘congruence’ transformations $q_1' = q_1 + \nu\pi/2$.

The above reflexion problem is, moreover, likely to be metrically transitive (every invariant function is equivalent to a constant). The concavity of the boundary pieces implies that the extremal joining two given points is shorter than any other joining line of equal topological type. The other end of a long extremal is furthermore extremely sensitive to changes of the initial phase.

The two-dimensional analog of the coin problem is a homogeneous needle in a vertical plane subjected to gravity and bouncing elastically on a horizontal line (floor). This is equivalent to the motion of a point $(q_1, q_2)$, $q_1 + 2\pi \equiv q_1$, subjected to gravity, with elastic reflexion on the curve,

$q_2 = \sqrt{3}\,|\cos q_1|.$

There is, of course, an analogous problem.

As long as the reflexion is elastic, the mechanism is conservative, the measure

$dq_1\,dq_2\,dp_1\,dp_2$

being invariant. In the case $e < 1$ the system is dissipative and the needle finally ($t \to \infty$) comes to rest ($q_2 = 0$). The above element of
volume  contracts  in  the  ratio  after  a  collision.  If  the  needle  is 
inhomogeneous, the center of gravity lying eccentrically, the curve where the point bounces is again a cosine-curve, however with alternating amplitude in the alternating intervals $(n\pi, (n + 1)\pi)$. It is of interest to find the frequencies of the two final positions (frequencies of the six sides of an inhomogeneous die).

11. The Ergodic Theory. In order to go deeper into the problems of frequency and distribution phenomena, at least as concerns conservative mechanisms, we must make use of a fundamental theorem according to which any phase function has a time average along the curves of motion. This was first proved by v. Neumann¹⁹ and Carleman²⁰ in the sense of mean convergence, and by Birkhoff²¹ in the sense of actual convergence almost everywhere in phase space. The core of Birkhoff's result concerns a single one to one transformation $T$ of $\Omega$ into itself which leaves a finite measure $m$ invariant.

¹⁹ J. v. Neumann, Proc. of the Nat. Ac. of Sc. 18 (1932), 70-82. An elementary proof is found in the author's paper referred to subsequently.

Ergodic Theorem.²² For any summable function $g(P)$ the limit

$\lim_{n \to \infty} \dfrac{1}{n} \sum_{\nu = 0}^{n - 1} g(T^\nu(P))$

exists in all points $P$ of $\Omega$ except in a set of $m$-measure zero. The limit function is again summable over $\Omega$.
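
The averages in the ergodic theorem are easy to compute for a concrete transformation. The sketch below is an illustration under assumptions: it takes for $T$ a rotation of the circle by an angle incommensurable with $2\pi$ (a metrically transitive example chosen here, not taken from the text), and the names `rotation`, `birkhoff_average` and `in_quarter` are hypothetical.

```python
import math

# Assumed rotation angle: 2*pi*(golden ratio - 1), incommensurable with 2*pi.
ALPHA = math.pi * (math.sqrt(5.0) - 1.0)


def rotation(phi):
    """The transformation T: phi -> phi + ALPHA (mod 2*pi)."""
    return (phi + ALPHA) % (2.0 * math.pi)


def birkhoff_average(g, T, p, n):
    """(1/n) * sum of g(T^nu(p)) for nu = 0, ..., n - 1."""
    total = 0.0
    for _ in range(n):
        total += g(p)
        p = T(p)
    return total / n


def in_quarter(phi):
    """Characteristic function of the sector 0 <= phi < pi/2."""
    return 1.0 if 0.0 <= phi < math.pi / 2 else 0.0
```

For this rotation `birkhoff_average(in_quarter, rotation, 0.3, 100_000)` should lie close to $1/4$, the fraction of the circle taken up by the sector, and the same average of $\cos^2\varphi$ should lie close to $1/2$.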

A trivial but useful consequence is that

(39)    $\lim_{n \to \infty} \dfrac{1}{n}\,h(T^n(P)) = 0$

holds in almost all points $P$, $h$ being an arbitrary summable function. We now consider a conservative flow $T_t(P)$ in $\Omega$ with the finite invariant measure $m$. According to postulate c) in section 5

$f_t(P) = f(T_t(P))$

is a measurable function of $(P, t)$, if $f(P)$ is a measurable function of the point $P$.

If $f(P)$ is summable over $\Omega$, so is

for  almost  all  t.  We  state  Birkhoff’s 

Time-Average Theorem. For any summable function $f(P)$, the time-average

(40)    $f^*(P) = \lim_{\tau \to \infty} \dfrac{1}{\tau} \int_0^\tau f_t(P)\,dt$

exists in almost all points of $\Omega$ and represents again a summable function.

²⁰ T. Carleman, Acta Math. 59 (1932), 63-87.

²¹ G. D. Birkhoff, Proc. of the Nat. Ac. of Sc. 17 (1931), 650-660. See also J. v. Neumann, Annals of Math. 33 (1932), 587-642.

²² This simplified formulation of Birkhoff's result was mentioned by the author, Proc. of the Nat. Ac. of Sc. 18 (1932), 93-100 (bounded $f$). For summable $f$ see A. Khintchine, Math. Annalen 107 (1933), 485, where the reader finds a simplified representation of Birkhoff's proof.




Proof.²³ The function

$g(P) = \int_0^1 f_t(P)\,dt$

is summable over $\Omega$. On setting $\tau = n + r$, $0 \le r < 1$, $n$ being an integer, $T = T_1$, we obtain

$\dfrac{1}{\tau} \int_0^\tau f_t(P)\,dt = \dfrac{n}{\tau} \cdot \dfrac{1}{n} \sum_{\nu = 0}^{n - 1} g(T^\nu(P)) + \dfrac{1}{\tau} \int_0^r f_t(T^n(P))\,dt.$

To the first term we may apply the ergodic theorem. The second term is, in absolute value, not greater than

$\dfrac{1}{n}\,h(T^n(P)),$

where

$h(P) = \int_0^1 |f_t(P)|\,dt$

is summable over $\Omega$. (39) therefore completes the proof.

Generalization. Under the hypothesis of the time-average theorem,

(41)  $$\lim_{T\to\infty} \frac{1}{T}\int_0^T e^{-i\lambda t} f_t(P)\,dt,$$

$\lambda$ being a real number, exists in almost all points $P$.

Proof. We set

$$T = T_{2\pi/\lambda}, \qquad g(P) = \int_0^{2\pi/\lambda} e^{-i\lambda t} f_t(P)\,dt.$$

$g(P)$ is then summable over $\Omega$. The proof follows from the fact that
the above expression (41) equals ($T = 2\pi n/\lambda + r$, $r$ bounded)

$$\frac{n}{T}\cdot\frac{1}{n}\sum_{\nu=0}^{n-1} g(T^{\nu}(P)) + \frac{1}{T}\int_0^r e^{-i\lambda t} f_t(T^{n}(P))\,dt.$$

The limit (41) represents a summable function $\varphi(P)$, which is easily
seen to satisfy the relation

$$\varphi_t(P) = e^{i\lambda t}\varphi(P),$$


** See the author's note, l. c.




for almost all $P$, when $t$ is (arbitrarily) fixed. The imaginary part
$\psi(P)$ of $\log \varphi(P)$ (mod. $2\pi$) represents an 'angular variable' of the
mechanism,** $\psi_t(P) = \psi(P) + \lambda t$ (mod. $2\pi$). $\lambda$ is called a proper value
of the mechanism, if there is a corresponding $\varphi \not\equiv 0$. The proper values
constitute the point spectrum of the mechanism. The limit (40), i.e.
(41) in the case $\lambda = 0$, represents always a function that is invariant
under the flow ('integral' of the mechanical system). The 'integrals'**
are only connected with the curves of motion, while the 'angular
variables' have essential bearing upon the time in which these curves
are passed through.
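The notion of a proper value can be made concrete on the simplest flow, the uniform rotation $P_t = P + t$ (mod. $2\pi$) of a circle. The sketch below estimates the weighted time average (41) by a midpoint rule; the test function $f(P) = 0.5 + \cos P$, the finite horizon `T` and the step count are illustrative assumptions. For this flow one expects a non-vanishing limit only for $\lambda = 0$ and $\lambda = \pm 1$.

```python
import cmath
import math


def weighted_time_average(f, p, lam, T, steps=200_000):
    """Midpoint-rule estimate of (1/T) * integral_0^T exp(-i*lam*t) f(p + t) dt.

    The flow is assumed to be the rotation P_t = P + t of the circle.
    """
    dt = T / steps
    total = 0j
    for k in range(steps):
        t = (k + 0.5) * dt
        total += cmath.exp(-1j * lam * t) * f(p + t)
    return total * dt / T


def f(p):
    # Illustrative summable function on the circle.
    return 0.5 + math.cos(p)


P = 0.7
phi_1 = weighted_time_average(f, P, lam=1.0, T=2000.0)     # near exp(i*P) / 2
phi_half = weighted_time_average(f, P, lam=0.5, T=2000.0)  # near 0
phi_0 = weighted_time_average(f, P, lam=0.0, T=2000.0)     # near the mean 0.5
print(phi_1, phi_half, phi_0)
```

Here the limit for $\lambda = 1$ is $\varphi(P) = e^{iP}/2$, whose argument is the angle $P$ itself, in agreement with the relation $\psi_t(P) = \psi(P) + \lambda t$.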

When $f$ is the characteristic function of a point set $A$, (40) represents
the mean sojourn in $A$ of the wandering point $T_t(P)$. If, for any
elementary region $A$ (sphere, parallelepiped), almost all points have the
same mean sojourn in $A$, the mechanism is necessarily metrically
transitive.** Conversely, metric transitivity implies that the time average
is always constant (except in a set of measure zero). In this case the
time-average equals

$$f^* = \frac{(f, 1)}{(1, 1)}.$$

For later purposes we need the

Lemma 3. For any summable $f(P)$, the convergence of

$$f^{(T)}(P) = \frac{1}{T}\int_0^T f_t(P)\,dt$$

towards the time-average $f^*(P)$ is strong,

(42)  $$(|f^{(T)} - f^*|, 1) \to 0$$

as $T \to \infty$.

Proof. It must be remarked that the functions $|f^{(T)}|$ need not lie
below a fixed summable function. The set $A^{(T)}$ where the integrand in
(42) is less than a given $\epsilon > 0$ is well known to have the property

(43)  $$\lim_{T\to\infty} m(\Omega - A^{(T)}) = 0.$$

We have

(44)  $$(|f^{(T)} - f^*|, 1) \le \epsilon\, m(\Omega) + (|f^*|, \psi) + (|f^{(T)}|, \psi),$$

** B. O. Koopman, l. c.

** v. Neumann, l. c.




$\psi$ being the characteristic function of the above set $\Omega - A^{(T)}$. According to
the absolute continuity of the indefinite integral $\int |f^*|\,dm$, and in
virtue of (43), the second term in (44) tends to zero. As to the third
term, we have

$$|f^{(T)}| \le \frac{1}{T}\int_0^T |f_t|\,dt$$

and

$$(|f^{(T)}|, \psi) \le \frac{1}{T}\int_0^T (|f|, \psi_{-t})\,dt.$$

This converges to zero, since $\psi_{-t}$ is the characteristic function of a set
of measure (independent of $t$)

$$m(\Omega - A^{(T)})$$

and since, therefore,

$$(|f|, \psi_{-t})$$

can be made uniformly small, $0 < t < T$, which completes the proof.
Simple consequences of the lemma are the following.

$$\lim_{T\to\infty} \frac{1}{T}\int_0^T (f_t, g)\,dt = (f^*, g),$$

$f$ being summable and $g$ being bounded. When $h$ is a bounded invariant
function, $h_t = h$, this implies according to

$$(f_t, h) = (f_t, h_t) = (f, h)$$

the relation

$$(f, h) = (f^*, h)$$

which determines $f^*$. From

$$(|f^{(-T)} - f^*|, 1) = (|f^{(T)} - f^*|, 1), \qquad f^{(-T)} = \frac{1}{T}\int_0^T f_{-t}\,dt,$$

we infer that $f^*$ is also the time-average in the past. We note that, for
arbitrary $f$ and $g$,

$$(f^*, g) = (f, g^*) = (f^*, g^*).$$




We conclude this section with a simple and interesting application to
distribution phenomena. Suppose that in a dynamical system

$$\frac{dq_i}{dt} = \frac{\partial H}{\partial p_i}, \qquad \frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i}, \qquad H = L + Q,$$

the kinetic energy $L$ is a quadratic form of the momenta $p_i$ (with constant
coefficients) and that the potential energy $Q$ is a homogeneous function
of degree $\alpha \ne 2$ of the coordinates $q_i$. The invariant manifolds of
constant energy

$$H = \text{const.}$$

are supposed to be closed.

The transformations

$$q_i' = \mu^{2} q_i, \qquad p_i' = \mu^{\alpha} p_i, \qquad t' = \mu^{2-\alpha}\, t$$

obviously leave the equations of motion invariant. It appears from this
that the motion is essentially the same on all level surfaces $H =$ const.,
with a time scale that varies from surface to surface.

A more striking example for this behavior is furnished by the rectilinear
and uniform motion of a particle in a closed vessel with perfectly
reflecting walls. The curve of motion is independent of the speed, in
other words, the flow is the same on all level surfaces (speed = const.)
in $\Omega$, but the time scale varies (inversely proportional to the speed,
case $\alpha = \infty$ of the former example). Finally, the simple example of the
frictionless roulette wheel (section 7) should not be forgotten.

The general situation in these examples is the following. There is a
one parameter family of manifolds filling the phase space $\Omega$ and being
invariant under the conservative flow $T_t(P)$. The element of invariant
measure $m$ on $\Omega$ furnishes in a well known way an invariant and finite
measure $\sigma$ on the level surfaces. We pick out one ($\Sigma$) of these surfaces
and denote by $\pi$ the points of $\Sigma$ and by $S_t(\pi)$ the flow on $\Sigma$. If $\lambda$
signifies a suitable parameter of the above family, the flow $T_t(P)$ in $\Omega$
is of the following kind,

(45)  $$P = (\pi, \lambda), \qquad T_t(P) = (S_{\lambda t}(\pi), \lambda); \qquad dm = \varphi(\lambda)\,d\lambda\,d\sigma,$$

$\varphi(\lambda)$ being a positive and continuous function of $\lambda$. (45) means that
the flow is geometrically the same on all surfaces $\lambda =$ const., but that its
rapidity changes with $\lambda$.




We prove then the

Stationarity Theorem. An arbitrary distribution (ensemble in Gibbs'
terminology) in $\Omega$, subjected to a mechanism of the type (45), tends
to distribute itself in a stationary** way as $t \to \infty$.

Concerning the second example, this means:

A large number of particles of various** speeds moving without
external forces within perfectly reflecting walls assumes a
steady distribution in the long run.

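Proof. It is to be proved that

(46)  $$\lim_{t\to\infty} (f_t, g)$$

exists, $f(P)$ being summable and $g(P)$ being bounded on the phase
space $\Omega$. We can write $f = f(\pi, \lambda)$, $g = g(\pi, \lambda)$, $\Omega$ being the product
space of $\Sigma$ and a finite $\lambda$-interval. According to lemma 1, it is sufficient
to show the existence of the limit when $g$ is of the form

(47)  $$g(\pi, \lambda) = \begin{cases} g(\pi), & \lambda \text{ in } (\lambda_0, \lambda_1), \\ 0, & \text{otherwise}, \end{cases}$$

$g(\pi)$ being bounded and measurable on $\Sigma$. In order to prove it in this
case, we may restrict our attention to the case where $f$ is of the form

(48)  $$f(\pi, \lambda) = \begin{cases} f(\pi), & \lambda \text{ in } (\lambda_0, \lambda_1), \\ 0, & \text{otherwise}, \end{cases}$$

$f(\pi)$ being summable over $\Sigma$. It implies no loss of generality to take
the same $\lambda$-intervals in (47) and (48) since only the common part of the
two intervals enters the integral $(f_t, g)$. The latter integral can now be
written

$$(f_t, g) = \int_\Sigma \left\{ \int_{\lambda_0}^{\lambda_1} f(S_{\lambda t}(\pi))\,\varphi(\lambda)\,d\lambda \right\} g(\pi)\,d\sigma.$$

Again, there is no loss of generality in assuming that $\varphi(\lambda)$ is piecewise
constant, in particular, that

$$\varphi(\lambda) = 1, \qquad \lambda_0 \le \lambda < \lambda_1.$$

We now have

$$\int_{\lambda_0}^{\lambda_1} f(S_{\lambda t}(\pi))\,d\lambda = \frac{1}{t}\int_{\lambda_0 t}^{\lambda_1 t} f(S_s(\pi))\,ds.$$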


** Stationary = invariant under the flow.

** This is absolutely essential. When all particles have exactly equal speed
the theorem need not be true.




This tends, in virtue of the time-average theorem, towards a limit
function. The limit (46) exists therefore, as $t \to \infty$, c. q. e. d.

The limit (46) necessarily equals $(f^*, g)$, $f^*$ being the time-average of $f$.
When the flow is metrically transitive on the level surfaces, the long run
distribution is seen to depend upon $\lambda$ only. This is the case in the
example considered in section 5 since the flow

$$P = \varphi \ (\text{mod. } 2\pi), \qquad P_t = \varphi + t \ (\text{mod. } 2\pi)$$

is metrically transitive.
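The stationarity theorem lends itself to a small simulation. The sketch below is a simplified stand-in for the particles in a vessel: free motion on a circle of unit circumference, each particle keeping its own speed. The initial cluster, the speed range and the observed arc are illustrative assumptions. With various speeds the fraction of particles in the arc should settle near its length, whereas with exactly equal speeds it keeps jumping.

```python
import random


def fraction_in_arc(positions, speeds, t, lo=0.5, hi=0.75):
    """Fraction of particles on the unit circle lying in [lo, hi) at time t."""
    hits = 0
    for x0, speed in zip(positions, speeds):
        if lo <= (x0 + speed * t) % 1.0 < hi:
            hits += 1
    return hits / len(positions)


rng = random.Random(0)
n = 50_000
# All particles start inside the short arc [0, 0.1].
positions = [rng.uniform(0.0, 0.1) for _ in range(n)]
various = [rng.uniform(1.0, 2.0) for _ in range(n)]  # speeds spread over [1, 2]
equal = [1.0] * n                                    # exactly equal speeds

times = (1000.3, 1000.55)
late_various = [fraction_in_arc(positions, various, t) for t in times]
late_equal = [fraction_in_arc(positions, equal, t) for t in times]
print(late_various, late_equal)
```

The second list illustrates why the variety of speeds is essential: the equal-speed cluster merely rotates rigidly and never spreads.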

12. Mixture and Statistic Regularity. According to section 7, a
conservative mechanism $T_t(P)$ is a mixing mechanism if, and only if

(49)  $$\lim_{t\to\infty} (f_t, g) = \frac{(f, 1)(g, 1)}{(1, 1)}$$

holds for any two bounded functions $f$, $g$** (general tendency towards
uniform mixture). Some reasons compel us to consider besides (49) a
slightly wider definition of tendency (convergence). A mixing
mechanism (in the wider sense) is characterized by the condition that

(50)  $$\lim_{\beta-\alpha\to\infty} \frac{1}{\beta-\alpha}\int_\alpha^\beta \left[ (f_t, g) - \frac{(f, 1)(g, 1)}{(1, 1)} \right]^2 dt = 0$$

shall hold for any $f$, $g$.

Tendency in this sense is easily seen to be equivalent with the following
statement: There exists a 'dense' set $\{t\}$ on the time axis,

$$\lim_{\beta-\alpha\to\infty} \frac{\text{meas}(\{t\},\ \alpha < t < \beta)}{\beta-\alpha} = 1,$$

such that (49) holds for any $f$, $g$ provided that $t \to \infty$ along $\{t\}$.**

From the point of view of most applications, (50) is hardly less
important than (49). Moreover, general criteria for (50) are far
more easily obtained than for (49) and represent directly what we
would expect.

Mixture Theorem.** The necessary and sufficient condition that a
conservative mechanism possess the mixture property (in the wider
sense) is that it be metrically transitive and that it have no angular
variables, i.e. that $\lambda = 0$ be the only and a simple proper value.
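The contrast between a mechanism with an angular variable and a mixing one can be sketched in discrete time. The two maps below are illustrative assumptions and do not occur in the text: a rotation of the circle, which is metrically transitive but carries an angular variable, and Arnold's cat map on the torus, a standard example of a mixing transformation. In both cases $f = g$ is the characteristic function of a set of measure $1/2$, so that $(f, 1)(g, 1)/(1, 1) = 1/4$.

```python
import math
import random


def cat_map(x, y):
    # Arnold's cat map on the unit torus; it preserves area.
    return (2.0 * x + y) % 1.0, (x + y) % 1.0


def cat_correlation(n, samples=40_000, seed=1):
    """Monte Carlo estimate of (f_n, g) for f = g = indicator of {x < 1/2}."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        start_in_set = x < 0.5
        for _step in range(n):
            x, y = cat_map(x, y)
        if start_in_set and x < 0.5:
            hits += 1
    return hits / samples


def rotation_correlation(n, alpha=math.sqrt(2) - 1):
    """Exact overlap of the arc [0, 1/2) with its image after n rotations by alpha."""
    shift = (n * alpha) % 1.0
    return abs(0.5 - shift)


cat_values = [cat_correlation(n) for n in (6, 8, 10)]
rotation_values = [rotation_correlation(n) for n in range(1, 200)]
print(cat_values, min(rotation_values), max(rotation_values))
```

The cat map correlations cluster around $1/4$, while those of the rotation keep oscillating between values near $0$ and near $1/2$, so that neither (49) nor (50) can hold for it.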

** It follows then also for summable $f$.

** B. O. Koopman and J. v. Neumann, Proc. of the Nat. Ac. of Sc. 18 (1932), 255.

** E. Hopf, Proc. of the Nat. Ac. of Sc. 18 (1932), 204-209. See also Koopman
and v. Neumann, l. c.




Proof. The necessity of the metric transitivity follows similarly as
in the case (49). The nonexistence of angular variables is seen in the
following way. The validity of (50) for complex $f$, $g$ follows easily, once
(50) is true for real $f$, $g$. A proper function $\varphi$ must, now, be orthogonal
to every invariant function $h$, $h_t = h$, since for every $t$

$$(\varphi, h) = (\varphi_t, h_t) = (\varphi_t, h) = e^{i\lambda t}(\varphi, h).$$

In particular, $(\varphi, 1) = 0$. On setting $f = \varphi$, $g = \bar\varphi$ in (50) we find $\varphi \equiv 0$.

This supposes that $\varphi$ is bounded. If this is not true, we take $\varphi/(1 + |\varphi|)$
which is a bounded proper function.

Suppose, now, that the flow $T_t$ be metrically transitive and that no
proper function ($\lambda \ne 0$) exists. It then follows that, for $\lambda \ne 0$, the
limit (41) is always zero. On setting

$$\varphi^{(T)}(P) = \frac{1}{T}\int_0^T e^{-i\lambda t} f_t(P)\,dt$$

we obtain, in perfect analogy to lemma 3,

$$\lim_{T\to\infty} (|\varphi^{(T)}|, 1) = 0.$$

This implies (including the case $\lambda = 0$),

(51)  $$\lim_{T\to\infty} \frac{1}{T}\int_0^T e^{-i\lambda t}\,\{(f_t, g) - (f^*, g)\}\,dt = 0,$$

for any $\lambda$, and for any $f$, $g$. (50) is to be inferred from (51). For that
purpose we make use of a theorem of S. Bochner.** A necessary and
sufficient condition that a continuous and bounded function $F(t)$,
$-\infty < t < \infty$, be representable by a Fourier-Stieltjes integral,

$$F(t) = \int_{-\infty}^{\infty} e^{iut}\,dp(u),$$

$p(u)$ being bounded and nowhere decreasing, is that the 'quadratic form'

(52)  $$\iint \overline{\psi(s)}\,\psi(t)\,F(t - s)\,ds\,dt$$

be nonnegative, $\psi$ being any bounded function which vanishes outside a
sufficiently large interval.

** S. Bochner, Vorlesungen über Fouriersche Integrale, Leipzig 1932, 76. The
applicability to the mixture theorem was mentioned by A. Khintchine, Proc. of
the Nat. Ac. of Sc. 19 (1933), 567.




This condition is fulfilled in the case**

$$F(t) = (f_t, \bar f)$$

since $F(t - s) = (f_t, \bar f_s)$ and since, therefore, (52) equals

$$\left( \int \psi(t)\, f_t\,dt,\ \int \overline{\psi(s)}\,\bar f_s\,ds \right) \ge 0.$$

The function

$$G(t) = (f_t, g) - (f^*, g)$$

can, now, be obtained by adding and subtracting functions of the above
form $F(t)$, and admits therefore of the representation

(53)  $$G(t) = \int_{-\infty}^{\infty} e^{iut}\,dq(u),$$

$q(u)$ being of bounded variation in $(-\infty, \infty)$. $q(u)$ must, in virtue of
(51), be continuous everywhere, because otherwise $G(t)$ would contain a
term $c\,e^{iat}$ for which (51), $\lambda = a$, would not vanish. This applies to the
case $a = 0$ too. The continuity of $q$ in (53) implies**

$$\lim_{\beta-\alpha\to\infty} \frac{1}{\beta-\alpha}\int_\alpha^\beta |G(t)|^2\,dt = 0,$$

c. q. e. d.

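We shall now approach the same and related problems from another,
simpler angle, by considering the product mechanism in the symmetric
product space $\Omega^2$,

$$\pi = (P, Q) = (Q, P); \qquad \pi_t = (P_t, Q_t).$$

The same notation, in particular $\langle \ldots \rangle$ for the scalar product in $\Omega^2$ is
used. The equation

(54)  $$\lim_{\beta-\alpha\to\infty} \frac{1}{\beta-\alpha}\int_\alpha^\beta \left[(f_t, g) - (f^*, g)\right]^2 dt = 0$$

is, in virtue of

$$\lim_{\beta-\alpha\to\infty} \frac{1}{\beta-\alpha}\int_\alpha^\beta (f_t, g)\,dt = (f^*, g),$$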

** i.e. complex $f$ are admitted.

** E. Hopf, Sitzungsber. Berl. Akademie, 1932, XIV.




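equivalent to the equation

(55)  $$\lim_{\beta-\alpha\to\infty} \frac{1}{\beta-\alpha}\int_\alpha^\beta (f_t, g)^2\,dt = (f^*, g)^2.$$

On setting

(56)  $$F(\pi) = f(P)\,f(Q), \qquad G(\pi) = g(P)\,g(Q)$$

and

(56')  $$F'(\pi) = f^*(P)\,f^*(Q)$$

we find

$$\langle F_t, G\rangle = (f_t, g)^2, \qquad \langle F', G\rangle = (f^*, g)^2.$$

(55) can thus be written in the form

$$\lim_{\beta-\alpha\to\infty} \frac{1}{\beta-\alpha}\int_\alpha^\beta \langle F_t, G\rangle\,dt = \langle F', G\rangle.$$

On the other hand, the limit on the left exists always and equals

$$\langle F^*, G\rangle,$$

$F^*$ being the time-average of $F$. (54) is therefore perfectly equivalent
with the equation

(57)  $$\langle F^* - F', G\rangle = 0,$$

where $F$, $F'$, $G$ are defined by (56), (56').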

Second Mixture Theorem. The necessary and sufficient condition
that a conservative mechanism have the mixing property (in the wider
sense) is that the symmetric product mechanism be metrically transitive.**

Proof. The necessity of the metric transitivity follows from the
remark at the end of section 9.**

Suppose, now, the product flow to be metrically transitive. This
implies, of course, the metric transitivity of the given flow on $\Omega$. We
have, therefore,

$$F^* = \frac{\langle F, 1\rangle}{\langle 1, 1\rangle} = \frac{(f, 1)^2}{(1, 1)^2} = F'$$

and hence (57), c. q. e. d.


** Consequence: The metric transitivity of a product flow (better square flow)
implies the mixture property of this flow.

** This applies also to mixing in the wider sense.




Another necessary and sufficient condition for the mixing property is
that every single transformation $T_t$ ($t \ne 0$) of the group be metrically
transitive (complete transitivity).**

We shall now apply the same simple idea in order to obtain a criterion
for statistic regularity of an event produced by a conservative
mechanism. Statistical regularity shall be considered in the wider sense, i.e.

(58)  $$\lim_{\beta-\alpha\to\infty} \frac{1}{\beta-\alpha}\int_\alpha^\beta \left[(f_{-t}, \varphi_A) - (f, 1)\,L(A)\right]^2 dt = 0$$

shall hold for any summable $f \ge 0$, $L(A)$ being independent of $f$. If
the relative frequency $L(A)$ exists it must equal

(58')  $$L(A) = \frac{(f^*, \varphi_A)}{(f, 1)} = \frac{(f^*, \varphi_A^*)}{(f^*, 1)}.$$

We remember that the statistic regularity of $A$ implies that the a priori
probability of $A$ be independent of the special invariant measure used to
compute that probability (this applies to the wider definition (58) as
well). The converse is, however, not true.** However, we prove the

Theorem on Statistical Regularity. A necessary and sufficient
condition that an event $A$ be statistically regular with respect to a
conservative mechanism, is that the a priori probability of the
simultaneous event $A \times A$ be independent of the special invariant
measure used to compute it (invariant under the symmetric
product mechanism).

Proof. The necessity follows from the independence theorem (which
holds also in the sense (58) of statistical regularity) and from the remark
just made on the a priori probability.

Suppose, now, that

$$\frac{m'(A \times A)}{m'(\Omega^2)},$$

$m'$ being a finite measure, invariant under the product flow, and being
comparable to the given invariant measure

$$\mu = \int dm_P\,dm_Q,$$


** E. Hopf, l. c.**. Examples of mixing flows (in the wider sense) have been
discovered by J. v. Neumann, Annals of Math. 33 (1932), 587-642.

** It would be true under the additional condition, that $\varphi_A$ be orthogonal to all
proper functions.




does not depend upon the special $m'$. On setting

$$\varphi(\pi) = \varphi_A(P)\,\varphi_A(Q)$$

for the characteristic function of the simultaneous event $A \times A$, this
is found to imply that

$$\frac{\langle H, \varphi\rangle}{\langle H, 1\rangle}$$

(remember that in $\langle \ldots \rangle$ the element of integration is $d\mu$) have the
same value for all summable and invariant functions $H(\pi) \ge 0$, $H(\pi) \not\equiv 0$.
On applying this, once to the case $H = F'$ in (56),
(56'), secondly to the case $H = F^*$, we find, since $F'$ and $F^*$ are both
invariant,

$$\frac{\langle F^*, \varphi\rangle}{\langle F^*, 1\rangle} = \frac{\langle F', \varphi\rangle}{\langle F', 1\rangle} \quad (= L(A)^2).$$

According to $\langle F^*, 1\rangle = \langle F, 1\rangle = (f, 1)^2 = (f^*, 1)^2 = \langle F', 1\rangle$
this reduces to (57), which completes the proof.

The significance of this theorem consists in that it gives the, at least
theoretical, means how to recognize the statistic regularity of an event
from the intrinsic properties of the mechanism producing it.

13. Frequency in Sequences. Frequency phenomena have, so far,
been considered in an idealized fashion (continuously many experiments,
described by distribution functions). It is, however, to be expected
that statistical regularity in this sense must logically imply a similar
regularity of occurrence in an infinite sequence of experiments. In other
words, there must be a 'true' law of large numbers which represents a
statement about what is actually going to happen when an experiment
is repeated a large number of times, under given causal conditions.**

It is evident that the statistical regularity of an event produced by a
causal mechanism should imply the same frequency in sequences 'in
general,' not, however, in every individual, theoretically possible
sequence. We start with a conservative mechanism with respect to
which a certain event $A$ is statistically regular. A sequence of
experiments consists in a sequence $P_1, P_2, P_3, \ldots$ of initial phases in $\Omega$.
The sequence of events produced (after elapse of the time $t$) is
represented by the sequence of corresponding phases $T_t(P_1), T_t(P_2), \ldots$.
How often a point of such a sequence falls into the part $A$ of $\Omega$, must be


** The classical theorem of large numbers expresses much less.




investigated by introducing a measure in the totality of all sequences.
We first consider the totality $\Omega^\infty$ of all sequences

$$S: \ldots, P_{-2}, P_{-1}, P_0, P_1, P_2, \ldots; \qquad P_i \subset \Omega.$$

Subsets of $\Omega^\infty$ are called $\alpha, \beta, \ldots$. We consider a finite measure $\mu(\alpha)$
on $\Omega^\infty$ in the sense of section 4, the generating measure $\bar m$ on $\Omega$ being
comparable to the given measure $m$ on $\Omega$, but otherwise perfectly
arbitrary, $\bar m(\Omega) = 1$.

It is convenient to introduce the following $\mu$-preserving and one to
one transformation $T$ of $\Omega^\infty$ into itself,

$$S' = T(S), \qquad P_i' = P_{i+1},$$

i.e. we shift the 'coordinates' of $S$ one step to the left. This
transformation obviously leaves the measure $\mu$ of every set of the form (7)
invariant. Since all measurable sets $\alpha \subset \Omega^\infty$ are generated by sets (7),
$T$ is generally $\mu$-preserving. The most remarkable property of $T$ is
that it has the mixture property, i.e. that

(59)  $$\lim_{n\to\infty} \int_{\Omega^\infty} F(T^n(S))\,G(S)\,d\mu = \int_{\Omega^\infty} F\,d\mu \int_{\Omega^\infty} G\,d\mu$$

holds for any summable $F(S)$ and for any bounded $G(S)$.** According
to lemma 2 and to its corollary it is sufficient to prove (59) in the case
where $F$ and $G$ both depend only upon a finite number of 'coordinates'.
In this case, however, (59) is evident because, for sufficiently large $n$,
the arguments of the two factors on the left are totally different.

The mixture property implies the metric transitivity of $T$. The ergodic
theorem shows, therefore, that

(60)  $$\lim_{n\to\infty} \frac{1}{n}\sum_{\nu=0}^{n-1} F(T^\nu(S)) = \int_{\Omega^\infty} F(S)\,d\mu,$$

$F(S)$ being summable over $\Omega^\infty$, holds for almost all sequences $S$, 'almost
all' in the sense of the measure $\mu$.

Let us, now, consider one-sided sequences

$$\Sigma: P_1, P_2, P_3, \ldots.$$

The measure $\mu$ of a set of sequences $\Sigma$ shall simply be defined by the
measure $\mu$ of the set of sequences $S$, obtained by attaching all possible

** The reader will recognize the similarity of the transformation $T$ to the mixing
process of section 8, in particular, its representation in a scale of notation.




left hand tails. From now on, measure, measurability, summability is
meant in this sense.

Lemma on Sequences. For any given summable $F(\Sigma)$, the equation

$$\lim_{n\to\infty} \frac{1}{n}\sum_{\nu=0}^{n-1} F(T^\nu(\Sigma)) = \int F(\Sigma)\,d\mu$$

holds for almost all sequences $\Sigma$ in the sense of the $\Sigma$-measure $\mu$.

If

$$F(\Sigma) = F(P_1, P_2, \ldots, P_k)$$

depends upon a finite number of the 'coordinates' only, we find

(61)  $$\lim_{n\to\infty} \frac{1}{n}\sum_{\nu=0}^{n-1} F(P_{1+\nu}, \ldots, P_{k+\nu}) = \int_\Omega \cdots \int_\Omega F\,d\bar m_1 \cdots d\bar m_k$$

along almost all sequences $\Sigma$, in particular, if $F = F(P_1) = F(P)$,

(62)  $$\lim_{n\to\infty} \frac{1}{n}\sum_{\nu=1}^{n} F(P_\nu) = \int_\Omega F\,d\bar m.$$**

It must not be forgotten, however, that this statement depends
essentially upon the choice of the measure $\bar m$ on $\Omega$, which generates the
measure $\mu$ on $\Omega^\infty$. For another initial measure $\bar m$ on $\Omega$ (and therefore
another $\mu$) the same statement holds, but the value of the right hand
side of (62) might be different.

We must, now, consider the mechanism $T_t(P)$ which, so far, has not
been taken account of at all. The event $A$ is supposed to be statistically
regular with respect to $T_t$. Since an arbitrary measure $\bar m$ on $\Omega$
comparable to $m$, is connected to $m$ by means of a weight function $f(P)$,
the right hand side of (62) equals, according to $\bar m(\Omega) = 1$,

$$\frac{(F, f)}{(f, 1)}.$$


** The reader will readily recognize herein the classical Bernoulli theorem. It
is, however, to be emphasized that (62) appears here as a mere mathematical tool
necessary to express the 'true' law of large numbers. (62) can, of course, be
obtained by the usual method of proof which, moreover, leads to quantitative
estimates, in the case where $F$ is the characteristic function of a part of $\Omega$. Since
we are pursuing but qualitative questions, we found it convenient to let (62)
appear as a consequence of the ergodic theory.



When for $F$ the choice

(63)  $$F(P) = \varphi_A(T_t(P)) = \varphi_{A,t}(P)$$

is made, $\varphi_A(Q)$ being the characteristic function of the event $A$, the left
hand side of (62) represents the relative frequency of $A$ in a sequence of
experiments with the initial phases $P_1, P_2, \ldots$ and with the outcomes
$T_t(P_1), T_t(P_2), \ldots$. The right hand side tends to $L(A)$, as $t \to \infty$,
independent of the measure $\bar m$. All this may be expressed by the

Frequency Theorem. Let the event $A$ be statistically regular with
respect to a causal mechanism $T_t(P)$. Let, furthermore, $\mu$ denote a
measure within the totality of the sequences $\Sigma$, of total measure $1$. For an
arbitrarily given sequence of $t$-values that converge to infinity,
there is a set $\alpha$ of sequences $\Sigma$ of measure $\mu(\alpha) = 0$ such that

$$\lim_{t\to\infty}\ \lim_{n\to\infty} \frac{1}{n}\sum_{\nu=1}^{n} \varphi_A(T_t(P_\nu)) = L(A)$$

holds along that $t$-sequence and along any $\Sigma$ which does not belong
to $\alpha$. This statement is true for every measure $\mu$.

Less precisely expressed, the event $A$ appears 'in general' with the
approximate relative frequency $L(A)$. Although this theorem is proved
here for measures $\mu$ of a very restricted type, it should turn out to be true
under much more general hypotheses.

On the Impossibility of a Gambling System. Let us return, for a
moment, to a measure $\mu$ on the totality of all sequences $\Sigma$. If two
functions $F(\Sigma)$, $G(\Sigma)$ satisfy the relation

(64)  $$\int FG\,d\mu = \int F\,d\mu \int G\,d\mu,$$

we find from the lemma of the preceding section that

(65)  $$\lim_{n\to\infty} \frac{\sum_{\nu=0}^{n-1} F(T^\nu(\Sigma))\,G(T^\nu(\Sigma))}{\sum_{\nu=0}^{n-1} G(T^\nu(\Sigma))} = \int F\,d\mu$$

holds for almost all sequences $\Sigma$. Let $F = F(P_{k+1})$ where $F(P)$ is the
function (63), and let

$$G = G(P_1, \ldots, P_k) \ge 0.$$




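(64) being satisfied, (65) becomes

(66)  $$\lim_{n\to\infty} \frac{\sum_{\nu=k+1}^{k+n} \varphi_{A,t}(P_\nu)\,G(P_{\nu-k}, \ldots, P_{\nu-1})}{\sum_{\nu=k+1}^{k+n} G(P_{\nu-k}, \ldots, P_{\nu-1})} = \frac{\bar m(A_{-t})}{\bar m(\Omega)}.$$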


From the statistical regularity of $A$ we obtain therefore the

Generalization of the Frequency Theorem. A perfectly analogous
statement holds when the expression occurring there is replaced
by the left hand expression of (66).

When $G$ takes but the values zero and one, the left hand side of (66)
represents the relative frequency of the event $A$ within a certain
systematically selected subsequence. Such a selection does, therefore, not
affect the theoretical frequency. When a gambler applies a system of
selection in such a way that he makes his stake dependent upon a
certain number of previous outcomes, he will in general fail to make a
profit out of it. Various generalizations of these considerations are
evidently possible.
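The statement about systems of selection can be tried out numerically. In the following sketch the outcomes of successive experiments are modelled, as a simplifying assumption, by independent draws with a fixed success probability `p`; the helper `selected_frequency` and the two selection rules are hypothetical illustrations of a function $G$ that takes the values zero and one and depends on the $k$ preceding outcomes only.

```python
import random


def selected_frequency(outcomes, k, rule):
    """Relative frequency of success among the trials chosen by `rule`.

    `rule` sees only the k outcomes preceding a trial and returns True
    when the gambler decides to stake on that trial.
    """
    staked = wins = 0
    for v in range(k, len(outcomes)):
        if rule(outcomes[v - k:v]):
            staked += 1
            wins += outcomes[v]
    return wins / staked, staked


rng = random.Random(2)
p = 0.3  # assumed probability of the event A in a single experiment
outcomes = [1 if rng.random() < p else 0 for _ in range(200_000)]

overall = sum(outcomes) / len(outcomes)
# Stake only after two failures in a row.
after_two_losses, count = selected_frequency(outcomes, 2, lambda prev: sum(prev) == 0)
# Stake only directly after a success.
after_a_win, _ = selected_frequency(outcomes, 1, lambda prev: prev[0] == 1)
print(overall, after_two_losses, after_a_win)
```

Within sampling error the selected subsequences show the same relative frequency as the whole sequence, so neither rule yields an advantage.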

14. Dissipative Mechanisms. Roulette and Buffon's Needle. The
well known phenomena connected with the coin, die, Buffon's needle
and roulette are actually produced by dissipative mechanisms. Since,
unfortunately, an analogue of the ergodic theory has not been developed
in these cases, we have to confine ourselves to a few remarks and to
simple examples.

The simple and more or less well known treatment of the roulette
problem serves as a good illustration for our purposes. For the sake of
simplicity, let us suppose that the wheel starts always from the normal
position ($\varphi = 0$). The final position will be an increasing function

$$\varphi = F(\omega); \qquad F(\omega) \to \infty, \quad \omega \to \infty;$$

of the initial velocity. We consider a certain range $(a, a + b)$ of initial
velocities and ask for the relative measure of the part of $(a, a + b)$
which leads to final positions $\varphi$ within a given sector $(\varphi_0, \varphi_1)$, i.e. within the
intervals

$$(\varphi_0 + 2n\pi,\ \varphi_1 + 2n\pi), \qquad n = 0, 1, 2, \ldots,$$

($\varphi_1 - \varphi_0 < 2\pi$). Under what condition does this relative measure
approach the fraction $(\varphi_1 - \varphi_0)/2\pi$? First of all, the initial point $a$ of the



mentioned $\omega$-interval should tend to infinity (the wheel must be turned
rapidly in order to produce the effect of uniform distribution). How
the length $b$ of the $\omega$-interval must behave at the same time, depends
upon the properties of $F(\omega)$ for large $\omega$.

Let us call a sequence of intervals $(a, a + b)$ ($a \to \infty$) a 'macroscopic'
sequence if the effect of uniform distribution takes place in it.** Let us
first find the conditions under which the $\omega$-intervals of constant
length ($b$ independent of $a$) are macroscopic. Such conditions are

(67)  $$\lim_{\omega\to\infty} F'(\omega) = \infty$$

and

(68)  $$\lim_{\omega\to\infty} \frac{F'(\omega + s)}{F'(\omega)} = 1,$$

uniformly in every finite $s$-interval. The second condition is certainly
fulfilled if $F''/F' \to 0$ as $\omega \to \infty$. The physical meaning of (67) is that
the friction $r(\omega)$, defined by $\dot\omega = -r(\omega)$, thus yielding

$$F(\omega) = \int_0^\omega \frac{\omega\,d\omega}{r(\omega)},$$

be small in comparison with the speed, for large $\omega$.
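A numerical sketch may help to see how conditions (67) and (68) produce the uniform distribution. It assumes, purely for illustration, a constant friction $r(\omega) = c$, for which $F(\omega) = \omega^2/(2c)$ satisfies both conditions; the sector $(1, 2)$, the interval length $b = 1$ and the grid size are likewise assumptions.

```python
import math


def sector_fraction(a, b, phi0, phi1, c=1.0, samples=200_000):
    """Fraction of initial speeds in (a, a + b) whose final angle lies in the sector.

    Assumes constant friction r(w) = c, so that F(w) = w**2 / (2 * c).
    """
    hits = 0
    for k in range(samples):
        w = a + (k + 0.5) * b / samples  # midpoints of a uniform grid of speeds
        phi = (w * w / (2.0 * c)) % (2.0 * math.pi)
        if phi0 <= phi < phi1:
            hits += 1
    return hits / samples


target = (2.0 - 1.0) / (2.0 * math.pi)  # (phi1 - phi0) / (2 * pi)
fractions = {a: sector_fraction(a, 1.0, 1.0, 2.0) for a in (1.0, 10.0, 100.0, 1000.0)}
print(target, fractions)
```

For slow initial speeds the fraction is far from the target value, while for large $a$ the interval of constant length behaves as a macroscopic one.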

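(67) and (68) imply that

$$\lim_{a\to\infty} \frac{\int g(\omega - a)\,p(F(\omega))\,d\omega}{\int g(\omega - a)\,d\omega} = \frac{1}{2\pi}\int_0^{2\pi} p(\varphi)\,d\varphi$$

holds for any $g(x)$, summable in $(0, \infty)$, and for any $p(\varphi)$ that is bounded,
measurable and periodic with the period $2\pi$. It is only necessary to
prove this when $g(x)$ is the characteristic function of an interval, for
instance, $(0, b)$. The left hand side then becomes

$$\lim_{a\to\infty} \frac{1}{b}\int_a^{a+b} p(F(\omega))\,d\omega$$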


** It is important to realize the necessity of this and analogous notions in all
dissipative mechanisms that are connected with frequency phenomena. The size
of these macroscopic regions can vary in different ways when the parameter upon
which the degree of distribution depends, tends towards its limit. The practical
condition for actual occurrence of the frequency phenomena is, as before, that the
'regions of inaccuracy' be themselves macroscopic regions.




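or on introducing the inverse function of $F$, $\omega = G(\varphi)$,

(69)  $$\lim_{a\to\infty} \frac{\int_{F(a)}^{F(a+b)} p(\varphi)\,G'(\varphi)\,d\varphi}{\int_{F(a)}^{F(a+b)} G'(\varphi)\,d\varphi}.$$

Since, in virtue of (68),

$$\lim \frac{G'(\varphi')}{G'(\varphi'')} = 1, \qquad F(a) \le \varphi',\ \varphi'' \le F(a+b), \quad a \to \infty,$$

$G'(\varphi)$ can simply be suppressed in the quotient (69), thus
yielding sensibly

$$\frac{1}{F(a+b) - F(a)}\int_{F(a)}^{F(a+b)} p(\varphi)\,d\varphi.$$

This tends to the required limit as, according to (67),

$$F(a+b) - F(a) = \int_a^{a+b} F'(\omega)\,d\omega \to \infty.$$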


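Another case of interest is that in which the macroscopic intervals
increase in length proportional to the speed ($b/a =$ const.).** We then
find, that the conditions

$$\lim_{\omega\to\infty} \omega F'(\omega) = \infty, \qquad \lim_{\omega\to\infty} \frac{F''(\omega)}{F'(\omega)} = 0$$

are sufficient for the validity of

$$\lim_{a\to\infty} \frac{\int \psi(\omega/a)\,p(F(\omega))\,d\omega}{\int \psi(\omega/a)\,d\omega} = \frac{1}{2\pi}\int_0^{2\pi} p(\varphi)\,d\varphi,$$

$\psi(x)$ being nowhere negative in $(1, \infty)$ and $p(\varphi)$ being bounded and
periodic with the period $2\pi$. An important special case is $F(\omega) \sim \omega$, i.e.
the friction $r(\omega)$ is proportional to the speed.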

As  the  mathematical  problem  connected  with  Buffon’s  needle  experi¬ 
ment  (a  needle  bouncing  imperfectly  elastically  on  a  rough  floor)  is 

** It might be imagined that the inaccuracy in the initial speed is proportional
to the speed (in analogy to Fechner's law in optics).




extremely difficult, we confine ourselves to the much simpler and equally
well performable case where the object (needle, rod, slab) moves within
the plane. Equidistant parallel lines are drawn upon a practically
infinite floor, the distance being $b$. A rod (in form of a long
parallelepiped) of length $l$, initially at rest on the floor, is vigorously pushed on
one end, so that it rapidly rotates and moves across the lines. The
relative frequency with which the long axis of the rod (marked somehow
on the rod), when again at rest, comes to lie across one of the lines, will
sensibly be

$$\frac{2l}{\pi b}.$$

More  generally,  all  combinations  of  angular  orientation  and  position  of 
the  center,  relative  to  the  lines,  will  occur  approximately  equally  often. 

We take an object in the form of an arbitrary plane figure moving in a
rough plane. Let $x$ be the coordinate of the center of gravity perpendicular
to the lines. Two values $x' \equiv x \pmod{b}$ are to be regarded
identical. Let $\varphi$ measure the angular displacement. The equations of
motion are

$$\dot{\varphi} = \omega, \qquad \dot{x} = u,$$

$$I\dot{\omega} = N, \qquad M\dot{u} = F,$$

$M$ being the mass; $I$ the moment of inertia; $F$ being the resultant of
frictional forces; and $N$ their torque. About the friction, many assumptions
can be made. The equations can, however, at once be integrated
when the frictional force upon the mass element $dm$ is supposed to be

$$-\mu\,\mathbf{v}\,dm,$$

$\mathbf{v}$ being the speed vector and $\mu$ being a constant. We do not care here
that in reality this law of friction is not obeyed. The equations become

$$\dot{\omega} = -\mu\omega, \qquad \dot{u} = -\mu u,$$

the integration of which gives the final position $(x', \varphi')$,

$$x' = x + \frac{u}{\mu}, \qquad \varphi' = \varphi + \frac{\omega}{\mu},$$

as a function of the initial phase $(x, \varphi, u, \omega)$. This is, mathematically,
equivalent to the product mechanism of two roulette models of the
second kind, friction proportional to speed. According to the independence
theorem, which is easily extended to the present case, the
required frequency effect is thus obtained, if

$$u \to \infty, \qquad \omega \to \infty,$$

the macroscopic regions being four dimensional parallelepipeds with sides
of constant length in $x$, $\varphi$ and with sides in $u$, $\omega$ proportional to these
speeds.
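
The frequency effect can be checked numerically. The following is a minimal sketch and not part of the original paper. It assumes friction proportional to speed, so that each speed decays as $u(t) = u\,e^{-\mu t}$ and the object comes to rest at $x' = x + u/\mu$, $\varphi' = \varphi + \omega/\mu$. It further assumes that the angle is measured from the direction of the lines, so that a rod of length $l$ lies across a line when the distance of its center from the nearest line is at most $(l/2)\,|\sin\varphi'|$. The function name `crossing_frequency`, the uniform distribution of the initial speeds and all numerical values are illustrative assumptions.

```python
import math
import random


def crossing_frequency(l=1.0, b=2.0, mu=1.0, n_trials=200_000, seed=0):
    """Estimate how often the rod comes to rest across one of the lines.

    l  : rod length (assumed l <= b)
    b  : distance between neighbouring parallel lines
    mu : friction constant in the assumed law (speed decays as exp(-mu * t))
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # Large, imprecisely known initial speeds (hypothetical ranges).
        u = rng.uniform(1000.0, 3000.0)
        omega = rng.uniform(1000.0, 3000.0)
        # Final position from integrating the exponentially decaying speeds,
        # starting from x = 0 and angle 0, reduced modulo the periods.
        x_final = (u / mu) % b
        phi_final = (omega / mu) % (2.0 * math.pi)
        # Distance from the rod's center to the nearest line.
        dist = min(x_final, b - x_final)
        if dist <= 0.5 * l * abs(math.sin(phi_final)):
            hits += 1
    return hits / n_trials


if __name__ == "__main__":
    print("simulated frequency:", crossing_frequency())
    print("2l / (pi b)        :", 2 * 1.0 / (math.pi * 2.0))
```

With these assumptions the simulated value should lie close to $2l/(\pi b)$, and the agreement should improve as the spread of $u/\mu$ and $\omega/\mu$ grows relative to $b$ and $2\pi$.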

Actually, the friction is much smaller for large speeds, and the macroscopic
regions will therefore be of smaller extent in $u$, $\omega$.

Another way of treating the problem is to keep the velocity range
fixed and to let $\mu \to 0$. This is immediately seen to lead to the same
frequency phenomenon.

Remark to §15. The relation (66) does not quite correspond to the
interpretation given subsequently. A system of selection makes use
of the previous outcomes, i.e. of the values of

MUP,.k)) . 

If,  in  (66),  G  be  replaced  by  a  function  of  these  values,  the  conclusion 
remains  the  same  and  the  interpretation  becomes  correct. 


NON-RIEMANNIAN  DYNAMICS  OF  ROTATING 
ELECTRICAL  MACHINERY 

By Gabriel Kron*


SYNOPSIS 


a.) It is shown that the charges through the terminals (brushes, taps,
etc.) of the innumerable types of rotating electrical machines (except
the alternator) are not true Lagrangian coordinates and the usual
form of the Equation of Motion of Lagrange

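$$\frac{d}{dt}\left(\frac{\partial T}{\partial \dot{x}^\alpha}\right) - \frac{\partial T}{\partial x^\alpha} + \frac{\partial F}{\partial \dot{x}^\alpha} = f_\alpha$$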

as interpreted by Maxwell for closed electric circuits moving in a magnetic
field is not valid for them, neither is the Equation of Voltage (derived
from the former equation by ignoring the geometrical variables)

$$e_\alpha = R_{\alpha\beta}\,i^\beta + \frac{d\varphi_\alpha}{dt} = R_{\alpha\beta}\,i^\beta + \frac{d(L_{\alpha\beta}\,i^\beta)}{dt}$$


since  the  coordinate  axes  are  not  connected  to  the  moving  conductors. 

The generalised form of the Equation of Motion valid for all electrical
machinery and in general for non-holonomic dynamical systems has been
given by Boltzmann and Hamel as


d 

dt 


dx-  ax* 


\dx-  dx*/  ’ 


+ 


dz* 


/. 


which includes as a special case the usual form. The generalized form
of the Equation of Voltage for rotating electrical machinery becomes


D  :a  I  diLmai*)  ,  j 


where: 1.) the induced voltage $L_{\alpha\beta}\,di^\beta/dt$ is due to the variation of the
currents, 2.) the Coriolis-voltage $(dL_{\alpha\beta}/dt)\,i^\beta$ is due to the motion of
the coordinate axes and 3.) the last term is a voltage due to the motion
of the conductors.

b.)  The  theory  of  non-holonomic  dynamical  systems  is  developed  in 
* Engineering General Dept., General Electric Co., Schenectady, N. Y.



tensor  symbolism  and  it  is  found  that  the  usual  explicit  form  of  the 
Equation  of  Motion  for  holonomic  dynamical  systems,  that  is 


-  ,  dx*  dx’’  ,  o  dx^  , 

+  3r  rfT  ■ 


is also valid for “quasi-holonomic” dynamical systems (a special case of
non-holonomic dynamical systems in which gravitational and electromagnetic
energies coexist) provided the Christoffel symbol of Riemannian
Geometry is replaced by the generalized asymmetrical Christoffel symbol
of Non-Riemannian Geometry subject to the restriction that the covariant
derivative of the metric tensor $a_{\alpha\beta}$ is zero. In other words it is
shown that the explicit form of the generalized Equation of Motion


^m$ 


+ 


djr*  dx^  „  dx* 

-di  Hi A 


/. 


remains invariant with respect to all quasi-holonomic transformations.

c.) Practical examples of such dynamical systems are the infinite
varieties of electrical machines that have not been analyzed before
from a dynamical point of view. Hence as quasi-holonomic dynamical
systems the performance of all rotating electrical machinery is described
during acceleration by the Equation of Motion

$$f_\alpha = R_{\alpha\beta}\,i^\beta + L_{\alpha\beta}\,\frac{di^\beta}{dt} + \Gamma_{\gamma\beta,\alpha}\,i^\gamma i^\beta$$

where $\Gamma_{\gamma\beta,\alpha}$ is the asymmetrical generalized Christoffel symbol and $L_{\alpha\beta}$
is the metric tensor representing all the self and mutual inductances
of the windings and the moment of inertia of the rotor. The performance
of any particular machine is found from that of a representative
machine by a routine transformation of coordinate axes with
the aid of a transformation tensor which is nothing else but a mathematical
representation of the customary connection diagram.

By  considering  their  connection  tensors  (or  connection  diagrams)  the 
various  types  of  rotating  electrical  machines  may  also  be  looked  upon  as 
holonomic or non-holonomic sub-spaces of one representative non-Riemannian
space. However each non-holonomic sub-space may also be
considered  in  its  own  right  as  a  non-Riemannian  space  because  of  the 
existence  of  electrical  non-holonomic  coordinates. 

When  various  types  of  rotating  (or  stationary)  electrical  apparatus 
are  connected  in  any  number  and  in  any  manner  whatever,  all  the 
operators  of  the  group  are  found  simply  by  adding  the  particular 
operators  of  the  individual  units  and  transforming  the  sum  by  the 




transformation tensor of the group. That is, entire rotating machines are
added as simple impedances.
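
The procedure of adding the operators of the individual units and transforming the sum can be sketched as follows. This is an illustration under stated assumptions, not the paper's own computation: the unit operators are taken to be impedance matrices placed side by side in one block-diagonal matrix, the transformation tensor is taken to be a matrix $C$ relating the old currents to the new ones ($i = C\,i'$), and the transformed operator is assumed to follow the usual law for a twice-covariant quantity, $Z' = C^{T} Z\, C$. The helper names `block_diag`, `matmul`, `transpose` and `connect` are hypothetical.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def block_diag(*blocks):
    """Place square matrices along the diagonal of one larger matrix."""
    n = sum(len(blk) for blk in blocks)
    out = [[0] * n for _ in range(n)]
    offset = 0
    for blk in blocks:
        for r, row in enumerate(blk):
            for c, val in enumerate(row):
                out[offset + r][offset + c] = val
        offset += len(blk)
    return out


def connect(unit_operators, c):
    """Add the unit operators and transform the sum: Z' = C^T Z C (assumed law)."""
    z = block_diag(*unit_operators)
    return matmul(matmul(transpose(c), z), c)


if __name__ == "__main__":
    # Two one-winding units carrying the same current (series connection):
    # both old currents equal the single new current, so C = [[1], [1]].
    z1 = [[2 + 1j]]
    z2 = [[3 + 4j]]
    c = [[1], [1]]
    print(connect([z1, z2], c))
```

For the series connection in the usage example the transformed operator is the plain sum of the two impedances, which is the simplest instance of units being added as simple impedances.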

The  Equation  of  Motion  also  can  be  written  as 

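$$f_\alpha = R_{\alpha\beta}\,i^\beta + L_{\alpha\beta}\,\frac{di^\beta}{dt} + [\gamma\beta, \alpha]\,i^\gamma i^\beta + T_{\gamma\beta\alpha}\,i^\gamma i^\beta$$

where $[\gamma\beta, \alpha]$ is the ordinary Christoffel symbol and $T_{\gamma\beta\alpha}$ is a skew-symmetric
tensor of rank three. Also $[\gamma\beta, \alpha]\,i^\gamma i^\beta$ is the Coriolis voltage
and $T_{\gamma\beta\alpha}\,i^\gamma i^\beta$ is the rotor generated voltage and torque.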

During  an  infinitesimal  disturbance  (hunting)  the  Equation  of  Small 
Oscillations  of  all  machines  is 

de,  -  R,0dif  +  L,t  +  T,y.Jti»iy  +  Tty^m^  +  iH^di*. 

at  OJT 

d.) For the special case of constant speed the Equation of Motion
reduces to

$$e_\alpha = Z_{\alpha\beta}\,i^\beta$$

where $Z_{\alpha\beta}$ is the “transient-impedance tensor.” The Equation of Small
Oscillation reduces to

$$\Delta e_\alpha = Z_{\alpha\beta}\,\Delta i^\beta$$

where $Z_{\alpha\beta}$ is the “motional-impedance tensor.” The routine performance
calculation of the various single and interconnected machines is immensely
facilitated by the use of these impedance tensors.
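
As a small illustration of such a routine calculation, an equation of the form $e_\alpha = Z_{\alpha\beta}\,i^\beta$ can be solved for the currents by inverting the impedance matrix. The sketch below is not taken from the paper: it assumes a two-axis machine at one fixed frequency, so that the impedance tensor is a 2 × 2 matrix of complex numbers, and the numerical values and the function names `solve_currents` and `apply_impedance` are made up for the example.

```python
def solve_currents(z, e):
    """Solve e_a = Z_ab i^b for a 2x2 complex impedance matrix by explicit inversion."""
    (z11, z12), (z21, z22) = z
    det = z11 * z22 - z12 * z21
    if det == 0:
        raise ValueError("impedance matrix is singular")
    i1 = (z22 * e[0] - z12 * e[1]) / det
    i2 = (-z21 * e[0] + z11 * e[1]) / det
    return [i1, i2]


def apply_impedance(z, i):
    """Form the voltages e_a = Z_ab i^b (summation over the repeated index b)."""
    return [sum(z_ab * i_b for z_ab, i_b in zip(row, i)) for row in z]


if __name__ == "__main__":
    # Hypothetical impedances (ohms) and impressed voltages (volts).
    Z = [[1 + 2j, 0.5j],
         [0.5j, 2 + 1j]]
    e = [10 + 0j, 0j]
    i = solve_currents(Z, e)
    print("currents:", i)
    print("check   :", apply_impedance(Z, i))
```

Substituting the currents back into the impedance matrix should reproduce the impressed voltages up to rounding.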

The “per-unit” concept used in synchronous machines is generalised
and is extended to all rotating machines by introducing the “contravariant”
voltage vector $e^\alpha$, the generalized Christoffel symbol of the
second kind etc.

e.) In order to eliminate in the routine performance calculation of
the various rotating machines the necessity of using more or less complicated
formulae for the transformation of symbols to various coordinate
systems, the “tensor” concept is introduced. In terms of tensors the
Equation of Motion becomes

$$f_\alpha = R_{\alpha\beta}\,i^\beta + L_{\alpha\beta}\,\frac{\delta i^\beta}{\delta t}$$

where $\delta/\delta t$ represents “absolute” or “covariant” differentiation. The
Equation of Voltage becomes

f.  -  +  V./«. 




These forms show that the equations of rotating machinery are identical
with those of stationary networks if ordinary differentiation is replaced by
absolute differentiation. Or in other words the appearance of the
gravitational mass of the rotor during acceleration in a purely electromagnetic
system changes all ordinary derivatives into absolute derivatives
(^*4/dx>»0) and it endows the electrical coordinates $x^\alpha$ with cylindrical
properties. The Equation of Small Oscillations becomes

4c,  “  +  LmS  —T—  d"  -|-  Ry$ai^dx^ 

ot 

where $K_{\delta\gamma\beta\alpha}$ is the generalized Riemann-Christoffel curvature tensor
and $R_{\gamma\beta\alpha}$ is a tensor of rank three introduced by the resistances.

f.) These equations express the fact that the performance calculation
of rotating electrical machinery is primarily a problem in mathematical
physics. It is the problem of the motion of a particle in an n-dimensional
non-Riemannian (affine) space with asymmetric connection (with torsion)
acted upon by a positional (non-conservative) force and opposed by a frictional
force proportional to its velocity.

g.) Throughout the paper emphasis has been laid upon a physical
interpretation of all abstract concepts introduced from the Absolute
Calculus and of the phenomena taking place inside of all machines
during acceleration and hunting. Complete examples are worked out
for the case of a salient-pole synchronous machine with amortisseur
windings having stationary and also moving rotor coordinate axes,
and for the most general unbalanced asymmetrical induction motor.
Scattered examples for various other machines are also given. An
attempt is also made to explain briefly all symbols new to the general reader.

All  examples  worked  out  by  the  dynamical  method  give  identical 
results  with  those  found  by  older  methods  by  various  writers  like  Park, 
Doherty,  Nickle  etc.  for  salient-pole  synchronous  machines,  Arnold, 
Dreyfus, Lyon etc. for induction and commutator machines.

The step from the complex numbers with which electrical engineers are
acquainted to the hyper-complex numbers with which this paper deals is logical
and is also inevitable sooner or later as the complexity of the electrical
systems to be analyzed increases.

INTRODUCTION 

a.)  At  the  present  time  the  performance  calculation  of  the  various 
types  of  rotating  electrical  machinery  is  still  in  an  unorganized  state. 
In  a  general  way  it  can  be  stated  that: 




1.) Each rotating machine has a different theory so that an engineer
who can calculate the performance of say the single-phase induction
motor is usually unacquainted with the performance calculation of the
salient-pole alternator and vice-versa. The learning of the performance
calculation of each machine requires usually an intensive study
of many months.

2.) Each leading engineer sets up his own theory about the machine he
specializes in and as a consequence the engineer who is acquainted with
the method of say Steinmetz or Arnold for the calculation of the
alternator must learn an entirely new language and procedure in studying
the method of say Park or Doherty and Nickle for the same alternator,
just as if it were an entirely different machine.

In  the  study  of  one  particular  machine  the  engineer  has  a  selection 
among  innumerable  theories  of  innumerable  writers  ranging  from  purely 
analytical  to  purely  graphical  through  various  degrees  of  semi-analytical 
and semi-graphical methods, each theory being usually independent of
the  other.  The  diversity  is  still  more  emphasized  by  the  various 
assumptions  as  to  the  nature  of  the  design  constants,  etc. 

In  the  purely  analytical  study  of  one  machine  the  procedure  usually 
consists  of  setting  up  a  set  of  equations  for  the  voltages,  another  set 
for  the  fluxes,  a  third  set  of  equations  for  the  torque  etc.  The  form  of 
the equations, however, is different with different writers and different
theories. The analysis of all these sets of equations is long and wearisome
because only the most primitive type of arithmetic can be utilized
with the customary methods of attack.

The  above  statements  refer  mainly  to  the  steady-state  analysis  of 
machines.  Lately  the  knowledge  of  the  transient  performance  of 
machines  is  becoming  of  increasing  importance  and  the  articles  that 
already  appeared  indicate  that  the  same  multiplicity  of  equations  and 
variety  of  theories  for  each  machine  that  makes  the  study  of  the  steady- 
state  performance  a  drudgery  will  be  more  than  ever  an  incentive  for 
the  engineer  to  follow  in  his  studies  the  path  of  least  resistance,  that  is 
to set up his own theory rather than try to find his way in a jungle,
adding thereby to the confusion. Curiously enough it is the response
of rotating machinery to transient disturbances that brings out most
emphatically the fundamental identity (group property) of all rotating
machines, more particularly it brings out the facts that:

1.) all machines have the same magnetic structure and differ only
in the manner of connection of the electric circuits,

2.) in all circuits of all types of machines the same simple physical
phenomenon occurs, namely $e = ir + d\varphi/dt + B \times \text{velocity}$.




And in spite of the fundamental identity of all machines and of all
circuit phenomena there are as many machine theories as there are types
of machines and there are types of engineers.

b.) It is the purpose of this paper to point out to electrical engineers
that there exists a new and powerful branch of mathematics, known
under various names of Tensor-Analysis, Absolute-Calculus, Ricci-Calculus,
etc., which is most ideally suited to unify the analysis of the
innumerable types of rotating machines. The power of the Calculus
lies in the fact that it unifies the analysis of a large variety of problems;
in particular

1.) it reduces the analysis of the differential equations of a whole
group of similar systems (like the group of all rotating electrical
machines) into the analysis of one representative member of the group
and the solutions for the other members of the group are found simply
by a routine transformation of coordinates.

2.) it sets up identical equations for each degree of freedom of the
representative system.

Although the Absolute Calculus found its first physical application
in relativity dynamics, it is now very extensively used in problems of
classical dynamics also. These classical problems mostly deal with
holonomic dynamical systems with true Lagrangean coordinates and
are formulated in terms of the motion of a particle in a Riemannian
space. When non-holonomic constraints or non-holonomic transformations
are introduced additional forces appear so that the particle
describes paths in a Riemannian space whose geometry is called “non-holonomic.”
In electrical machinery the paths assume a special form
and their performance can be considered as the motion of a particle
in a non-Riemannian space with asymmetric connection. As far as the
writer is aware the equations of a non-Riemannian space have been
used in physical problems so far only in the theory of relativity, in
particular in the various unified field theories.

To avoid any misunderstanding it is emphasized that the subject
matter of this paper has nothing whatever to do with the theory of
relativity. Whatever relative motions are considered are all relative
motions in the classical sense treated in any book on dynamics. Also
it should be emphasized that the mathematical tool itself, the Absolute
Calculus, has no connection with relativity. It was worked out
by Ricci about fifty years ago when there was not even any theory of
relativity. (Gauss, Riemann, etc. laid the foundation of the Calculus.)
The  tool  is  nothing  else  but  a  systematic  treatment  of  the  theory  of  a 




set of linear differential equations. However it has been immensely
improved in the hands of relativists, in fact the whole mathematical
apparatus of non-Riemannian space is the creation of relativists during
the last fifteen years in their attempts (so far futile) to unify the
differential equations of the gravitational and electromagnetic fields.
It is interesting that rotating electrical machinery, whose differential
equations are those of a non-Riemannian space, are non-holonomic
dynamical systems in which also gravitational and electromagnetic
energies coexist. An important connection between the relativistic
dynamical treatment and the present classical dynamical treatment
of the unified theory of gravitational and electromagnetic systems
is that in both treatments the non-Riemannian space is subject to the
restriction that the covariant derivative of the metric tensor $g_{\alpha\beta}$ is zero.
Also the cylindrical properties of the electrical coordinates are analogous
to those of the recently developed five-dimensional relativity.
The cyclical properties of the coordinates even suggest quantum-theoretical
analogies. It should be noted that as an alternative to the
present treatment of the performance as the motion of a particle, the
moving waves inside the machines could have been also analyzed. It
seems that any treatment of the non-sinusoidal space waves (subsynchronous
speeds, etc.) must take over a large part of the theory of infinite-dimensional
(Hilbert) spaces of modern quantum-dynamics (theory
of spectrum, etc.). It is also evident that many other formal analogies
exist between the equations of interconnected rotating electrical
machinery and those of a group of spinning electrons.

Of course electrical machinery offers a more easily comprehensible
picture to visualize the more elementary concepts of a non-Riemannian
space than the unified field theory does. It is believed that the reader
will find he has been familiar with most of these concepts except that
they did not have such euphonious names as “metric tensor,” “Christoffel
symbol” etc., they were only called “inductance,” “generated
voltage” etc. There is nothing in this paper that can not be understood
by anybody who ever attempted to understand textbooks on rotating machine
performance. The mathematical tool itself is easily learned, in words of
leading authorities it has “almost miraculous” power in simplifying
complicated mathematical processes and last but not least, it has a
certain beauty that will positively appeal to certain types of engineers.

c.)  In  the  early  days  of  rotating  machinery  the  dynamical  point  of  view 
has  been  successfully  applied  to  the  study  of  the  synchronous  machine 
(Hopkinson  etc.)  by  applying  to  it  the  Equation  of  Motion  of  Lagrange 




as has been interpreted by Maxwell for a system where geometrical and
electrical (Lagrangian) coordinates are present. But the method failed
with every other machine where the coordinate axes (brushes, connections,
etc.) were not connected to the moving conductors. At that time
non-holonomic dynamical systems were not analyzed in a systematic
manner, only in the present century was their general equation set up
by Boltzmann and Hamel but due to its scarcity of application even
today it is found only in the most advanced treatises on dynamics.

Since  then  sporadic  attempts  were  made  to  build  up  a  dynamical 
theory  of  rotating  electrical  machinery  (Ingram,  Dahlgren,  etc.)  but 
they  all  gave  the  same  equations  found  by  Maxwell,  they  all  used  them 
for  the  alternator  only  (if  at  all)  and  for  no  other  machine.  Similarly 
the equations derived in treatises on electrodynamics for closed electric
circuits  moving  in  a  magnetic  field  are  those  of  Maxwell  applicable  to 
the  alternator  only,  although  usually  remarks  are  made  that  they  are 
applicable  to  all  dynamo-electric  machines. 

So  Dynamics  dropped  out  of  the  picture  entirely  and  in  its  place 
electrical  engineers  built  up  an  immense  but  loosely  knit  structure  of 
physical  and  mathematical  interpretation  of  the  phenomena  taking 
place  inside  the  various  types  of  machines.  The  dynamical  equations 
of Lagrange have been expressly created to avoid just such an eventuality,
they were created to predict the performance of dynamical systems by
the aid of certain functions measured at the terminals, without knowing
anything about the mechanism of the phenomena inside the system. In
other words if the resistances and inductances of the various windings
are given as measured at the terminals and also the diagram of connections
of the terminals, no matter how complicated they are the dynamical
equations can be set up and the transient and steady-state performance
can be calculated in a routine manner by anyone who otherwise is
ignorant of the various theories of rotating machines given in textbooks.

That is, this paper represents a formal mathematical approach to the
performance calculation of all rotating electrical machinery and the setting
up and solutions of the dynamical equations are independent of any
physical theories.

Of  course  dynamical  equations  may  be  interpreted  physically  after 
they  are  set  up,  but  the  interpretation  should  be  considered  only  as  an 
after-thought.  It  is  employed  mainly  in  an  attempt  to  give  a  physical 
picture of the concepts introduced from the Absolute Calculus like
“tensor” etc. and it has nothing whatever to do with the setting up of the
dynamical equations and their solutions. Anyone interested only in performance



calculations may leave out the sections dealing with physical interpretations.

As expected, if the various terms in the generalised Equation of Motion
are reinterpreted as physical quantities (flux-density, vector-potential,
etc.) the new forms are nothing else but the Field Equations of
Maxwell generalised for moving bodies and moving coordinate axes. It
is intended to treat the various forms of the Field Equations (wave,
impulse-energy, etc.) in another publication and to interpret physically
the various curvature quantities ($K_{\alpha\beta}$, $K$) and tensors (stress-energy,
electromagnetic, etc.) as they apply to rotating electrical machinery.

d.) It has been most gratifying to find that while other methods require
often many months of preparation and thought and calculation to set up
even the steady-state performance of new types of machines (and that can
be done usually only by a few engineers experienced in such mental discipline)
the dynamical method presented in this paper gives identical results
by a routine calculation in only a few hours and it can be employed by
any one who knows how to multiply matrices and find their inverse. Even
in cases where other methods are practically helpless as in case of most
machines with asymmetrical windings or magnetic structure and unbalanced
impressed voltages or in cases where various types of machines are interconnected
in any arbitrary manner, this method gives correct results with
equal facility.

Other  advantages  of  this  method  are: 

1.) In learning the performance calculation of one machine at the
same time the performance calculation of all other machines is learned.

2.) First the transient performance of machines during acceleration
is analyzed. The transient performance calculation with the speed
maintained constant and the steady-state performance calculation
follow as special cases.

3.) The fundamental equations derived for the transient analysis of
rotating electrical machinery are identical with the fundamental equations
of Dynamics and of multidimensional Differential Geometry, hence the
researches of these sciences can be applied to the study of electrical
machinery.

4.) In introducing any future refinements into the analysis, for
instance the effect of space-harmonics, multiple phases, slot-openings,
brush-currents, etc. all fundamental equations set up in this paper
remain unchanged, only the value of the constants to be substituted
into the equations are to be changed.

5.) A powerful mathematical tool is acquired which can be used not




only in the study of rotating machinery, which is the case with all
other tools, but immediately can be applied in the most advanced
studies of mathematical physics.

e.) (In a previous paper the sudden short-circuit performance with
the speed maintained constant and the steady-state performance have
been analysed in greater detail from purely physical considerations in
the vector and dyadic notation of Gibbs. In a second paper small
transient variations in speed and steady hunting have been covered
from the same point of view. Important labor-saving devices such as
X-matrices, complex vectors etc. have been also introduced in them.)

Anyone interested only in the dynamical aspect of the theory of
electrical machinery may consult the following sections: II, III, IV, V,
VII, VIII, IX, XXIII, XXIV, XXX.

THE REPRESENTATIVE MACHINE WITH MOVING COORDINATE AXES

I. Hyper-complex Numbers

a.) The quantities the Absolute Calculus deals with are generalizations
of the complex number A + jB. A more compact notation for complex
numbers is (A, B). A three-dimensional complex number is iA + jB
+ kC, which may also be written (A, B, C). An evident extension
is to supply each number with two indices as Ajj + Bjk + Ckj + Dkk
+ .... This may be denoted by the compact form

$$\begin{pmatrix} A & B & \cdots \\ C & D & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}$$

called “matrix.” Sets of numbers may have three or more indices.

A scalar is represented by a single number, a vector by a set of $n$
numbers (or functions) arranged in a row, a dyadic by a set of $n^2$ numbers
arranged in a square (some of them may be zero), a triadic by a
set of $n^3$ numbers arranged in a cube etc. In general they will be
called “polyadics.”

Just as in steady-state a-c theory all quantities in the equations, say Z,
stand for a complex number r + jx, similarly in the equations of this paper
all quantities stand for any one of the above hyper-complex numbers.

b.) A component of a vector along axis d is denoted by an upper or
lower index as $A^d$ or $A_d$. One term of a dyadic is denoted as
$A^{dq}$, $A_d{}^q$ or $A_{dq}$. The order or position of the indices can not be
interchanged, since $A^d{}_q$ and $A_q{}^d$ belong to different sets of $n^2$ quantities.
One term of a triadic is written as $A^{dqf}$ or $A_{dq}{}^f$ etc.

An equation such as $e_\alpha = \sum_\beta Z_{\alpha\beta}\,i^\beta$ represents an equation along axis $\alpha$
where the index $\beta$ assumes all the possible values of the indices representing
the various axes d, q, f, etc. There is one big difference between
this notation and the usual scalar equations however. While with the
usual scalar equations each coordinate axis has a different equation, in
tensor notation the equations for all coordinate axes are identical. The
index $\alpha$ is simply replaced in turn by d, q, f, etc. for the various axes.

It should be noted that the indices to which the summation sign
applies appear twice in the same term (once as an upper and once as a
lower index). It is an accepted convention to dispense with the summation
sign and to write the above equation as $e_\alpha = Z_{\alpha\beta}\,i^\beta$. If $\alpha$ is say d,
then

$$Z_{d\beta}\,i^\beta = Z_{dd}\,i^d + Z_{dq}\,i^q + Z_{df}\,i^f + \cdots$$

The summation sign may occur twice or more in the same term. For
instance $\sum_\alpha \sum_\beta R_{\alpha\beta}\,i^\alpha i^\beta = R_{\alpha\beta}\,i^\alpha i^\beta$.

It  should  be  noted  that: 

1.) The two indices that are written with the same letter are called
“dummy” indices, the others “free” indices.

2.) Any letter may be used in the same term as a dummy index.
For instance $\Gamma_{\alpha\beta,\gamma}\, i^\beta = \Gamma_{\alpha\delta,\gamma}\, i^\delta$.

3.) One of the dummy indices must be an upper, the other a
lower index.

4.) Since most equations of this paper are vector equations, in every
term of the equations only one of the indices will be a free index. This
index must be denoted by the same letter in each term of the equation
and it must be an upper or a lower index in all terms.
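
The summation convention can be made concrete with a short sketch. The code below is only an illustration: the axis labels follow the text, while the function names and all numerical values are assumptions chosen for the demonstration. The free index becomes the key of the result, and each dummy index becomes an explicit sum.

```python
# Illustrative axis labels from the text; the numbers are made up.
AXES = ["d", "q", "f"]

def contract(Z, i):
    """e[a] = Z[a][b] * i[b], summed over the dummy index b.

    The free index a survives as the key of the returned dict.
    """
    return {a: sum(Z[a][b] * i[b] for b in AXES) for a in AXES}

def double_contract(R, i):
    """R[a][b] * i[a] * i[b], summed over both dummy indices a and b."""
    return sum(R[a][b] * i[a] * i[b] for a in AXES for b in AXES)

Z = {
    "d": {"d": 1.0, "q": 2.0, "f": 3.0},
    "q": {"d": 4.0, "q": 5.0, "f": 6.0},
    "f": {"d": 7.0, "q": 8.0, "f": 9.0},
}
i = {"d": 1.0, "q": 0.5, "f": -1.0}

e = contract(Z, i)            # one value per free index
scalar = double_contract(Z, i)  # no free index left, so a single number
```

Renaming the loop variable `b` changes nothing in the result, which is the content of note 2 on dummy indices.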

II.  The  Two  Representative  Machines 

a.) The most general rotating machine consists of two structures:

1.) A stator having asymmetrical magnetic structure and several
layers of asymmetrical windings arranged in any manner whatever.

2.) A rotor (not necessarily smooth) with several layers of windings
arranged in any manner whatever.

It is assumed that all brushes, sliprings, taps and connections are
removed from both structures.




b.) Two representative machines will be assumed, each with different
coordinate axes.

In the first machine:

1.) Each stator winding has its own axis as a stationary coordinate axis.

2.) Each rotor winding has its own axis as a moving coordinate axis,
all axes moving together with the rotor.

This machine will serve as the representative machine for which the
fundamental dynamical equations will be derived.

In the second machine:

1.) The stator coordinate axes are stationary and are located
anywhere.

2.) The rotor coordinate axes are also stationary and are located
anywhere.

All formulae calculated for the first machine will be transformed to
apply to the second machine, and this second machine with stationary
coordinate axes will serve as the representative machine from which
the equations for all other rotating machines will be derived by a
transformation of coordinate axes.

c.) The duplication of representative machines is necessary for the
following reasons:

1.) The derivation of the fundamental equations from the Lagrangian
equation is possible only for the machine with moving rotor coordinate
axes, because an axis must be connected to the moving conductors to serve
as a true Lagrangian coordinate axis. The only way to set the equations
up for other axes is by the process of “transformation of coordinate axes.”

2.) The routine calculation process of transformation for any
particular machine is simplest if the representative machine has
stationary rotor coordinate axes, because all its inductances (that is,
all coefficients of its metric tensor) are constants and the troublesome
cos θ and sin θ terms are eliminated in most cases.

d.) Both representative machines have an additional coordinate axis t
along the shaft at right angles in space to all other axes, to represent
the direction of all mechanical vectors like rotor angular velocity,
angular acceleration, torque, etc.

e.) The following notation of axes will be used:

1.) Any one of the coordinate axes of the representative machine
with moving coordinate axes will be denoted by k, m, n ....

2.) Any one of the axes of the machine with stationary coordinate
axes will be denoted by ν, ρ, σ, τ ....

3.) In general the coordinate axes of any machine will be denoted by
α, β, γ, δ ....

Individual axes will be denoted by q, d, b, a, t ....

III. The Test or Design Constants

In order to calculate the performance of any machine, it is necessary
to know two sets of quantities, either tested on an actual machine or
calculated from the design.

1.) The first set consists of the resistances of all windings and the
frictional resistance of the shaft (if any). The numbers are all
constants and can be arranged along the diagonal of a square array
(matrix). It is denoted by a symmetrical dyadic $R_{\alpha\beta}$ (equ. 1).


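$$
R_{\alpha\beta} =
\begin{array}{c|ccccc}
  & q & d & b & a & t \\ \hline
q & r_{qq} & 0 & 0 & 0 & 0 \\
d & 0 & r_{dd} & 0 & 0 & 0 \\
b & 0 & 0 & r_{bb} & 0 & 0 \\
a & 0 & 0 & 0 & r_{aa} & 0 \\
t & 0 & 0 & 0 & 0 & r_{tt}
\end{array}
\qquad (1)
$$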

2.) The second set of numbers represents the self and mutual
inductances of all windings along the axes considered, and the moment of
inertia of the rotor. The inductance of all rotor windings is expressed
as a function of the rotor angular displacement $\theta = x^t$. The set is
represented by a square array and is denoted by a symmetrical dyadic
$L_{\alpha\beta}$. This dyadic will be called the “metric tensor” (equ. 2).


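$$
L_{\alpha\beta} =
\begin{array}{c|ccccc}
  & q & d & b & a & t \\ \hline
q & L_{qq} & L_{qd} & L_{qb}(\theta) & L_{qa}(\theta) & 0 \\
d & L_{dq} & L_{dd} & L_{db}(\theta) & L_{da}(\theta) & 0 \\
b & L_{bq}(\theta) & L_{bd}(\theta) & L_{bb}(\theta) & L_{ba}(\theta) & 0 \\
a & L_{aq}(\theta) & L_{ad}(\theta) & L_{ab}(\theta) & L_{aa}(\theta) & 0 \\
t & 0 & 0 & 0 & 0 & L_{tt}
\end{array}
\qquad (2)
$$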

The knowledge of these two sets of numbers $R_{\alpha\beta}$ and
$L_{\alpha\beta}$ is sufficient to find the transient and steady-state
performances of any machine, assuming no magnetic saturation and no
iron losses.




IV. The Ideal Representative Machine

a.) In all equations developed below the value of $L_{mn}$ having terms
with any functions of θ can be substituted and the performance can be
found by a routine calculation. That is, a method of attack is given in
these pages by which the effect of space harmonics (slot openings, m.m.f.
harmonics, etc.) on the performance can be evaluated.

At this point, however, it is not intended to work out a complete
problem on harmonics; that is reserved for a future paper. Examples
will be worked out in which the assumptions regarding the variations
of the inductances with θ are the same as the most advanced treatises
on machine analysis have already used (e.g. in the case of the
salient-pole alternator, etc.), in order to show that the method of this
paper gives identical results in all cases worked out by other methods
with far less mental and physical labor.

b.) An ideal rotating machine is one in which (Fig. 1):

1.) The rotor windings are symmetrically distributed around the
smooth rotor.

2.) The stator has field-poles on it and asymmetrical windings
anywhere.

3.) The variation of rotor self and mutual inductances with θ follows
sine curves as shown below.

c.) In the body of the paper the following special case will be
followed through:

1.) One layer of winding exists on the stator with axes along q and
d, and

2.) One layer of winding exists on the rotor with mutually
perpendicular axes a and b at an angle θ from d and q.

(A more general example would be to assume more than two sets of
axes on the rotor, for instance three sets at 120 degrees apart as
in a three-phase synchronous machine, and also additional inductances,
like zero phase-sequence reactances. Since they form a special case of
a general treatment of space harmonics that is intended to be attacked
systematically later, this isolated example will not be followed through
here, though all equations developed are equally valid for it.)

The value of the metric tensor for this particular machine is

$$
L_{mn} =
\begin{array}{c|ccccc}
    & d_s & a & b & q_s & t \\ \hline
d_s & L_{sd} & M_d\cos\theta & -M_d\sin\theta & 0 & 0 \\
a   & M_d\cos\theta & L_1 - L_2\cos 2\theta & L_2\sin 2\theta & M_q\sin\theta & 0 \\
b   & -M_d\sin\theta & L_2\sin 2\theta & L_1 + L_2\cos 2\theta & M_q\cos\theta & 0 \\
q_s & 0 & M_q\sin\theta & M_q\cos\theta & L_{sq} & 0 \\
t   & 0 & 0 & 0 & 0 & L_{tt}
\end{array}
\qquad (3)
$$

where $L_1 = (L_{rq} + L_{rd})/2$ and $L_2 = (L_{rq} - L_{rd})/2$. The
subscript s refers to stator, r to rotor, d to direct-axis and q to
quadrature-axis quantities. L represents self-inductance and M mutual
inductance. $L_{rd}$ represents the maximum value of the self-inductance
of a rotor winding when its axis is along the field-poles (direct axis).

The values of $R_{mn}$ are

$$
R_{mn} =
\begin{array}{c|ccccc}
    & d_s & a & b & q_s & t \\ \hline
d_s & R_{d_s d_s} & 0 & 0 & 0 & 0 \\
a   & 0 & R_{aa} & 0 & 0 & 0 \\
b   & 0 & 0 & R_{bb} & 0 & 0 \\
q_s & 0 & 0 & 0 & R_{q_s q_s} & 0 \\
t   & 0 & 0 & 0 & 0 & R_{tt}
\end{array}
\qquad (4)
$$

d.) For the representative machine with two layers of windings on
the stator and two on the rotor $L_{mn}$ is given in table I. Some of the
other derived constants of this machine are also given in the table, to
be used in the analysis of machines whose connection diagram is given
in tables II and III. In the body of the paper the simpler machine
will be followed through, since it is also identical with the two-phase
salient-pole alternator with amortisseur winding. It should be noted,
however, that these simpler matrices in the body of the paper can be
found from those given in the table simply by cancelling the rows and
columns that belong to the second layers of the stator and rotor. That
is, all general expressions $L_{mn}$, etc. refer to those given in the
table. The simpler expressions in the body of the paper serve only as
illustrations that can be easily checked.




V. The Equation of Motion

a.) In the representative machine with moving coordinate axes let
the following quantities be defined:

1.) the total number of charges that passed through any winding,
counted from a definite time, and the instantaneous angular
displacement of the rotor, measured from the center of the field-pole,
are denoted by $x^k$, where k may have any value q, d, ... t. ($x^t$ is
also denoted as θ.)

2.) the value of the instantaneous current in each winding and the
instantaneous angular velocity of the rotor are $i^k = dx^k/dt$.

3.) the instantaneous terminal voltage applied to any winding and
the instantaneous applied shaft torque are $e_k$.

The instantaneous stored kinetic energy is $T = (1/2)\, L_{mn}\, i^m i^n$,
consisting of the sum of the magnetic and the mechanical kinetic energy.
The instantaneous dissipation function (one-half of the dissipated power)
is $F = (1/2)\, R_{mn}\, i^m i^n$. The potential energy of the machine is zero.

In a general dynamical system several types of coordinates may
exist: geometrical, elastic, magnetic, etc. In rotating electrical
machinery two types of coordinates occur, one geometrical coordinate θ
and n − 1 electrical coordinates (charges). Since electrical coordinates
differ in many respects from geometrical ones, the resultant dynamical
equations are special cases of the usual equations of Particle Dynamics
allowing special treatments. (Electrical coordinates are also called
“cyclic” coordinates.)
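
The two quadratic forms T and F can be evaluated numerically once the inductances are known. The sketch below is a minimal illustration, assuming the sinusoidal inductance pattern of the ideal machine with axis order (d_s, a, b, q_s, t); the function names and every numerical value are hypothetical and serve only to show the arithmetic.

```python
import math

def metric_tensor(theta, L_sd, L_sq, L_rd, L_rq, M_d, M_q, L_tt):
    """Inductance array of the ideal machine, axis order (ds, a, b, qs, t).

    Assumes L1 = (L_rq + L_rd)/2 and L2 = (L_rq - L_rd)/2, so that the
    self-inductance of axis a equals L_rd when theta = 0.
    """
    L1, L2 = (L_rq + L_rd) / 2, (L_rq - L_rd) / 2
    c, s = math.cos(theta), math.sin(theta)
    c2, s2 = math.cos(2 * theta), math.sin(2 * theta)
    return [
        [L_sd,     M_d * c,       -M_d * s,      0.0,     0.0],
        [M_d * c,  L1 - L2 * c2,  L2 * s2,       M_q * s, 0.0],
        [-M_d * s, L2 * s2,       L1 + L2 * c2,  M_q * c, 0.0],
        [0.0,      M_q * s,       M_q * c,       L_sq,    0.0],
        [0.0,      0.0,           0.0,           0.0,     L_tt],
    ]

def half_quadratic_form(A, i):
    """(1/2) A_mn i^m i^n, summed over both dummy indices."""
    n = len(i)
    return 0.5 * sum(A[m][k] * i[m] * i[k] for m in range(n) for k in range(n))

# Hypothetical design constants and instantaneous currents / angular velocity.
params = dict(L_sd=1.2, L_sq=0.9, L_rd=1.0, L_rq=0.6, M_d=0.8, M_q=0.5, L_tt=2.0)
resistances = [0.1, 0.2, 0.2, 0.1, 0.05]      # diagonal of R_mn
R = [[resistances[m] if m == k else 0.0 for k in range(5)] for m in range(5)]
currents = [1.0, -0.5, 0.7, 0.3, 2.0]         # last entry is the angular velocity

L = metric_tensor(0.4, **params)
T = half_quadratic_form(L, currents)   # stored kinetic energy
F = half_quadratic_form(R, currents)   # dissipation function
```

With only the shaft component non-zero, T reduces to one half of the moment of inertia times the square of the angular velocity, which is the mechanical part of the kinetic energy mentioned above.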

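b.) The Equation of Motion of Lagrange for a system containing kinetic
energy and dissipation is

$$e_k = \frac{d}{dt}\left(\frac{\partial T}{\partial i^k}\right) - \frac{\partial T}{\partial x^k} + \frac{\partial F}{\partial i^k} \qquad (5)$$

Substituting the values of T and F

$$e_k = \frac{d(L_{mk}\, i^m)}{dt} - \frac{1}{2}\frac{\partial L_{mn}}{\partial x^k}\, i^m i^n + R_{mk}\, i^m$$

Also

$$\frac{d(L_{mk}\, i^m)}{dt} = L_{mk}\frac{di^m}{dt} + i^m\frac{dL_{mk}}{dt} = L_{mk}\frac{di^m}{dt} + i^m\,\frac{\partial L_{mk}}{\partial x^n}\frac{dx^n}{dt}$$

Substituting

$$e_k = R_{mk}\, i^m + L_{mk}\frac{di^m}{dt} + \left(\frac{\partial L_{mk}}{\partial x^n} - \frac{1}{2}\frac{\partial L_{mn}}{\partial x^k}\right) i^m i^n \qquad (6)$$

In this equation it is customary to divide the first term in parenthesis
into two components by interchanging the indices m and n, so that

$$\frac{\partial L_{mk}}{\partial x^n}\, i^m i^n = \frac{1}{2}\left(\frac{\partial L_{mk}}{\partial x^n} + \frac{\partial L_{nk}}{\partial x^m}\right) i^m i^n \qquad (7)$$

The expression in the parenthesis of equ. 6 is a triadic; it will be
denoted by [mn, k] and called the “Christoffel symbol of the first kind”

$$[mn, k] = \frac{1}{2}\left(\frac{\partial L_{mk}}{\partial x^n} + \frac{\partial L_{nk}}{\partial x^m} - \frac{\partial L_{mn}}{\partial x^k}\right) \qquad (8)$$

Hence the Equation of Motion of the representative machine is

$$e_k = R_{mk}\, i^m + L_{mk}\frac{di^m}{dt} + [mn, k]\, i^m i^n \qquad (9)$$

c.) It will be assumed that each term of this equation will be
represented by the same symbols in any coordinate system, that is the
Equation of Motion will be assumed to be represented in any coordinate
system by replacing [mn, k] by $\Gamma_{\beta\gamma,\alpha}$,
representing any function of $L_{\alpha\beta}$

$$e_\alpha = R_{\alpha\beta}\, i^\beta + L_{\alpha\beta}\frac{di^\beta}{dt} + \Gamma_{\beta\gamma,\alpha}\, i^\beta i^\gamma \qquad (10)$$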


VI. Calculation of $\Gamma_{mn,k}$

a.) It will be found advantageous to define $\Gamma_{mn,k}$ for the
representative machine just as it follows from the dynamical equation 6

$$\Gamma_{mn,k} = \frac{\partial L_{mk}}{\partial x^n} - \frac{1}{2}\frac{\partial L_{mn}}{\partial x^k} \qquad (11)$$

instead of as in equ. 8. Both definitions of $\Gamma_{mn,k}$ will give
identical voltages and torques in all coordinate systems, but with the
definition of equ. 11 the actual calculation of voltages is cut into two
and the physical definitions are expressed in simpler forms.

b.) In calculating $\Gamma_{mn,k}$ from equ. 11 it should be noted that
one of the indices must be t, since $L_{mn}$ is a function of
$x^t = \theta$ only. That is, from equ. 11

1.) if n = t, $\Gamma_{mt,k} = \partial L_{mk}/\partial\theta$ ..... 12

2.) if k = t, $\Gamma_{mn,t} = -\tfrac{1}{2}\,\partial L_{mn}/\partial\theta$ ..... 13

3.) if m = t, $\Gamma_{tn,k} = 0$ ..... 14


Differentiating every term of $L_{mn}$ as given in equ. 3 with respect
to θ then
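
The differentiation and the equivalence of the two definitions (equ. 8 and equ. 11) can be checked numerically. The sketch below is an illustration only: it assumes the sinusoidal inductance pattern of the ideal machine with axis order (d_s, a, b, q_s, t), uses hypothetical numerical constants, and replaces the analytic derivative by a central difference. The function names are not from the paper.

```python
import math

N, T = 5, 4  # number of axes; position of the shaft axis t

def metric(theta, L_sd=1.2, L_sq=0.9, L_rd=1.0, L_rq=0.6,
           M_d=0.8, M_q=0.5, L_tt=2.0):
    """Ideal-machine inductances; every default value is hypothetical."""
    L1, L2 = (L_rq + L_rd) / 2, (L_rq - L_rd) / 2
    c, s = math.cos(theta), math.sin(theta)
    c2, s2 = math.cos(2 * theta), math.sin(2 * theta)
    return [
        [L_sd,     M_d * c,      -M_d * s,     0.0,     0.0],
        [M_d * c,  L1 - L2 * c2, L2 * s2,      M_q * s, 0.0],
        [-M_d * s, L2 * s2,      L1 + L2 * c2, M_q * c, 0.0],
        [0.0,      M_q * s,      M_q * c,      L_sq,    0.0],
        [0.0,      0.0,          0.0,          0.0,     L_tt],
    ]

def d_metric(theta, h=1e-6):
    """Central-difference estimate of dL_mn/dtheta."""
    hi, lo = metric(theta + h), metric(theta - h)
    return [[(hi[m][n] - lo[m][n]) / (2 * h) for n in range(N)] for m in range(N)]

def partial(dL, m, k, n):
    """dL_mk/dx^n; only the derivative with respect to x^t is non-zero."""
    return dL[m][k] if n == T else 0.0

def gamma_equ11(dL):
    """Gamma[m][n][k] as in equ. 11."""
    return [[[partial(dL, m, k, n) - 0.5 * partial(dL, m, n, k)
              for k in range(N)] for n in range(N)] for m in range(N)]

def gamma_equ8(dL):
    """Gamma[m][n][k] as the Christoffel symbol of equ. 8."""
    return [[[0.5 * (partial(dL, m, k, n) + partial(dL, n, k, m)
                     - partial(dL, m, n, k))
              for k in range(N)] for n in range(N)] for m in range(N)]

def contract(G, i):
    """Gamma_{mn,k} i^m i^n for each free index k."""
    return [sum(G[m][n][k] * i[m] * i[n] for m in range(N) for n in range(N))
            for k in range(N)]

theta = 0.4
i = [1.0, -0.5, 0.7, 0.3, 2.0]   # hypothetical currents and angular velocity
dL = d_metric(theta)
from_equ11 = contract(gamma_equ11(dL), i)
from_equ8 = contract(gamma_equ8(dL), i)
```

The individual components of the two arrays differ, but after contraction with $i^m i^n$ the results agree, because the product $i^m i^n$ is symmetric in m and n.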




For  instance  Tk.  ^  sin  $,  T*  i  “  0. 

c.) With the aid of the two divisions of $\Gamma_{mn,k}$ into
$\Gamma_{mt,k}$ and $\Gamma_{mn,t}$ the Equation of Motion can be divided
into two components. One is the Equation of Voltage, found by allowing
the free index to assume any value but t

$$e_k = R_{mk}\, i^m + L_{mk}\frac{di^m}{dt} + \Gamma_{mt,k}\, i^m i^t \qquad (k = d_s, a, b, q_s) \qquad (17)$$

representing the voltage equation of each coordinate axis. The other
component is the Equation of Torque, found by allowing the free index
to assume only the value t

$$e_t = R_{tt}\, i^t + L_{tt}\frac{di^t}{dt} + \Gamma_{mn,t}\, i^m i^n \qquad (18)$$

where the torque developed by the machine is

$$\text{Torque} = \Gamma_{mn,t}\, i^m i^n \qquad (19)$$

For instance if k = $q_s$, then the instantaneous voltage in the stator
along axis $q_s$ is found from equ. 17

$$e_{q_s} = R_{q_s q_s}\, i^{q_s} + L_{sq}\frac{di^{q_s}}{dt} + M_q\sin\theta\,\frac{di^a}{dt} + M_q\cos\theta\,\frac{di^b}{dt} + i^t M_q\cos\theta\; i^a - i^t M_q\sin\theta\; i^b.$$

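d.) If the labor-saving device is not followed and $\Gamma_{mn,k}$ is
defined as the sum of three components as in equ. 8, then
$\Gamma_{mn,k}$ is expressed in three matrices

1.) $\Gamma_{mt,k} = \tfrac{1}{2}\,\partial L_{mk}/\partial\theta$ = one half of equ. 15 ..... 20

2.) $\Gamma_{tn,k} = \tfrac{1}{2}\,\partial L_{nk}/\partial\theta$ = one half of equ. 15 ..... 21

3.) $\Gamma_{mn,t} = -\tfrac{1}{2}\,\partial L_{mn}/\partial\theta$ = equ. 16 ..... 22

With this definition the Equation of Voltage is

$$e_k = R_{mk}\, i^m + L_{mk}\frac{di^m}{dt} + \Gamma_{(mt),k}\, i^m i^t \qquad (23)$$

where

$$\Gamma_{(mt),k} = \Gamma_{mt,k} + \Gamma_{tm,k} \qquad (24)$$

The matrix of $\Gamma_{(mt),k}$ is equ. 15, that is the same as that of
$\Gamma_{mt,k}$.

e.) Also it should be noted that equ. 11 could have been defined as
$\Gamma_{nm,k}$ instead of $\Gamma_{mn,k}$. With this definition equ. 15
would have been denoted as $\Gamma_{tm,k}$ instead of $\Gamma_{mt,k}$.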

THE REPRESENTATIVE MACHINE WITH STATIONARY COORDINATE AXES

VII. The Transformation of Coordinate Axes

a.) Let a new representative machine be assumed in which the rotor
coordinate axes are stationary along the direct and quadrature axes
$d_r$ and $q_r$. Then the old variables $x^a$ and $x^b$ are replaced by
new variables $x^{d_r}$ and $x^{q_r}$, while the variables $x^{d_s}$,
$x^{q_s}$ and $x^t$ remain unchanged.

In order to set up a relation between the new and the old variables,
if possible, the variation of the mutual inductance between the new
and the old coordinate axes will be examined.

Returning to equ. 3, the mutual inductance between axes a and $d_s$
is $M_d\cos\theta$, where $M_d$ is the value of the mutual inductance when
axis a is opposite axis $d_s$. Hence the mutual inductance between axes a
and $d_r$ is assumed as $L_{rd}\cos\theta$, where $L_{rd}$ is the
self-inductance of axis a when it coincides with axis $d_r$.

Similarly the mutual inductance between axis b and $d_r$ is assumed
as $-L_{rd}\sin\theta$.

b.) Hence the following identity exists between the flux-linkages set
up at axis $d_r$ by currents at the new and by currents at the old
coordinate axes

$$i^{d_r} L_{rd} = i^a L_{rd}\cos\theta - i^b L_{rd}\sin\theta$$

since $i^{q_r}$ produces no flux-linkages at axis $d_r$.

The relation between the currents is found by cancelling $L_{rd}$

$$i^{d_r} = i^a\cos\theta - i^b\sin\theta \qquad (25)$$

Following similar reasonings with respect to the flux-linkages set up at
axis $q_r$ by currents at the new and at the old coordinate axes

$$i^{q_r} = i^a\sin\theta + i^b\cos\theta$$

The relations between the differentials of the new variables $dx^\nu$
and the old variables $dx^m$ are found from the two current equations by
cancelling dt

$$dx^{d_r} = \cos\theta\; dx^a - \sin\theta\; dx^b \qquad (26)$$

$$dx^{q_r} = \sin\theta\; dx^a + \cos\theta\; dx^b \qquad (27)$$

c.) The coefficients of the differentials can be represented in a square
matrix called the “transformation tensor” and denoted by $C^\nu_m$. Since
$x^{d_s}$, $x^{q_s}$ and $x^t$ remain unchanged, $C^{d_s}_{d_s} = 1$ etc.
That is, the transformation tensor $C^\nu_m$ is

$$
C^\nu_m =
\begin{array}{c|ccccc}
    & d_s & d_r & q_r & q_s & t \\ \hline
d_s & 1 & 0 & 0 & 0 & 0 \\
a   & 0 & \cos\theta & \sin\theta & 0 & 0 \\
b   & 0 & -\sin\theta & \cos\theta & 0 & 0 \\
q_s & 0 & 0 & 0 & 1 & 0 \\
t   & 0 & 0 & 0 & 0 & 1
\end{array}
\qquad (28)
$$

(Since the determinant is unity, the transformation is called
“orthogonal.”) Hence

$$dx^\nu = C^\nu_m\, dx^m \qquad (29)$$

Also

$$i^\nu = C^\nu_m\, i^m \qquad (30)$$

As an example the current along the new axis $d_r$ is found from its
values along the old axes

$$i^{d_r} = C^{d_r}_{d_s} i^{d_s} + C^{d_r}_a i^a + C^{d_r}_b i^b + C^{d_r}_{q_s} i^{q_s} + C^{d_r}_t i^t = \cos\theta\; i^a - \sin\theta\; i^b.$$

The values of the other vectors and polyadics along the new coordinate
axes are also found with the aid of the transformation tensor $C^\nu_m$
from their values along the old coordinate axes, but their formulae of
transformation are different from that of $i^\nu$ in equ. 30.
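
The transformation of the currents in equ. 30 can be sketched in a few lines. This is an illustration under stated assumptions: the array is stored as `C[nu][m]` (row = new axis, column = old axis), which is the transpose of the printed layout, and the function names, the angle and the currents are hypothetical.

```python
import math

OLD_AXES = ["ds", "a", "b", "qs", "t"]     # rotor axes a, b move with the rotor
NEW_AXES = ["ds", "dr", "qr", "qs", "t"]   # rotor axes dr, qr are stationary

def transformation_tensor(theta):
    """C[nu][m]: coefficient of the old component m in the new component nu."""
    c, s = math.cos(theta), math.sin(theta)
    return [
        [1.0, 0.0, 0.0, 0.0, 0.0],
        [0.0, c,   -s,  0.0, 0.0],   # i^dr = i^a cos(theta) - i^b sin(theta)
        [0.0, s,   c,   0.0, 0.0],   # i^qr = i^a sin(theta) + i^b cos(theta)
        [0.0, 0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 0.0, 1.0],
    ]

def transform_currents(C, i_old):
    """i^nu = C^nu_m i^m, summed over the dummy index m."""
    return [sum(C[nu][m] * i_old[m] for m in range(len(i_old)))
            for nu in range(len(C))]

theta = math.pi / 6
i_old = [2.0, 1.0, 0.5, -1.0, 3.0]   # hypothetical values along (ds, a, b, qs, t)
i_new = transform_currents(transformation_tensor(theta), i_old)
```

The stator currents and the angular velocity pass through unchanged, and the sum of the squares of the two rotor currents is preserved, as expected for an orthogonal transformation.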

VIII. Quasi-holonomic Transformations

a.) A very important point is to note that while an equation was set up
between the differentials of the old and the new variables, no equation
can be set up between the variables themselves, such as
$f(x^{d_r}, x^a, \ldots) = 0$. Consequently equ. 26 cannot be written as

$$dx^{d_r} = \frac{\partial x^{d_r}}{\partial x^a}\, dx^a + \frac{\partial x^{d_r}}{\partial x^b}\, dx^b \qquad (31)$$

If this last equation is assumed then

$$\frac{\partial x^{d_r}}{\partial x^a} = \cos\theta \qquad (32)$$

and the following contradiction results

$$\frac{\partial C^{d_r}_a}{\partial x^t} = -\sin\theta = \frac{\partial^2 x^{d_r}}{\partial x^t\,\partial x^a} \neq \frac{\partial^2 x^{d_r}}{\partial x^a\,\partial x^t} = \frac{\partial C^{d_r}_t}{\partial x^a} = 0.$$

Hence $C^{d_r}_a$ cannot be written as $\partial x^{d_r}/\partial x^a$;
in the expression $\partial C^\nu_m/\partial x^n$ the indices m and n
cannot be interchanged; the differential equation 26 cannot be
integrated; and the function $f(x^{d_r}, x^a, \ldots) = 0$ does not
exist. For this reason $dx^{d_r}$, $dx^{q_r}$ are not exact differentials
and consequently the treatment of rotating electrical machinery differs
from that of other dynamical systems with true Lagrangian coordinates
(called “holonomic” dynamical systems) where always an equation can be
set up between the old and the new variables themselves.
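
The non-integrability can be illustrated by accumulating the differential of equ. 26 along two different paths between the same end points. The sketch below is only a numerical illustration; the path end points, the step count and the function names are assumptions. If a function $x^{d_r}(x^a, x^b, \theta)$ existed, both paths would give the same total.

```python
import math

def d_x_dr(theta, dxa, dxb):
    """Equ. 26: the differential of the new coordinate along d_r."""
    return math.cos(theta) * dxa - math.sin(theta) * dxb

def integrate(path, steps=1000):
    """Accumulate dx^{dr} along straight segments joining (x_a, x_b, theta) points.

    Uses the midpoint value of theta on each small step.
    """
    total = 0.0
    for (a0, b0, t0), (a1, b1, t1) in zip(path, path[1:]):
        for k in range(steps):
            theta_mid = t0 + (k + 0.5) * (t1 - t0) / steps
            total += d_x_dr(theta_mid, (a1 - a0) / steps, (b1 - b0) / steps)
    return total

start = (0.0, 0.0, 0.0)
end = (1.0, 0.0, math.pi / 2)

# Path 1: pass unit charge through winding a first, then turn the rotor.
charge_then_rotate = integrate([start, (1.0, 0.0, 0.0), end])
# Path 2: turn the rotor first, then pass the same charge through winding a.
rotate_then_charge = integrate([start, (0.0, 0.0, math.pi / 2), end])
```

The first path accumulates the full unit charge along $d_r$, while the second accumulates practically nothing, so the result depends on the path and no single-valued $x^{d_r}$ can be assigned to the end point.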

b.) However, the treatment of electrical machinery differs also from that
of a general non-holonomic dynamical system, due to the presence of both
geometrical and electrical coordinates, which influences the forms of the
transformation tensor and of the metric tensor in the following way:

1.) The coefficients of the transformation tensor equ. 28 are either
constants or are functions of $x^t = \theta$ only. (In general they are
functions of all the old coordinates.)

2.) The coordinate $x^t$ always remains unchanged, that is $x^t$ can be
considered as an old and also as a new variable. Some of the other
coordinates also remain unchanged.

3.) The metric tensor $L_{mn}$ is a function of $x^t$ only, similarly
$\Gamma_{mn,k}$. (In general they are functions of all the old variables.)

Due to this special form the number of unknowns (dependent variables)
in the transformed equations is n, that is the same as the number of
dynamical equations. Before the transformation the variables are $x^t$
and the n − 1 currents $dx^m/dt$ (or the n − 1 electrical coordinates
$x^m$); after the transformation the variables are $x^t$ and the n − 1
new currents $dx^\nu/dt$ or $x^\nu$. (In case of general non-holonomic
dynamical systems the number of variables in the n transformed dynamical
equations is 2n. They are the n old coordinates occurring in $L_{mn}$ and
$C^\nu_m$ and the n new differentials or velocities. The n equations of
transformation supply the additional equations.)

Since in the transformed equations the new variables can be
calculated, the expression “non-holonomic coordinates” is justified, even
though in case of general non-integrable transformations the expression
is meaningless and only “differentials of non-holonomic coordinates” is
admissible. In other words “electrical non-holonomic coordinates” do
exist but “geometrical non-holonomic coordinates” do not.

Following a suggestion of Lorentz, who calls the system of electric
charges moving in a ponderable body a “quasi-holonomic” system
(although being non-holonomic it is solved as if it were holonomic),
rotating electrical machinery and similar dynamical systems will be called
“quasi-holonomic” dynamical systems, since they are soluble as if they
were holonomic dynamical systems.

c.) With respect to the electrical coordinates the matrix of
transformation appears as containing only constant terms, since θ is not
transformed. Hence the inverse transformation tensor $C^m_\nu$ is found
simply by calculating the inverse of the matrix of $C^\nu_m$. (In the
general case where $C^\nu_m = \partial x^\nu/\partial x^m$, first the old
coordinates must be expressed in terms of the new coordinates and only
then can $C^m_\nu$ be calculated by differentiation.) The inverse
transformation tensor is


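$$
C^m_\nu =
\begin{array}{c|ccccc}
    & d_s & a & b & q_s & t \\ \hline
d_s & 1 & 0 & 0 & 0 & 0 \\
d_r & 0 & \cos\theta & -\sin\theta & 0 & 0 \\
q_r & 0 & \sin\theta & \cos\theta & 0 & 0 \\
q_s & 0 & 0 & 0 & 1 & 0 \\
t   & 0 & 0 & 0 & 0 & 1
\end{array}
\qquad (33)
$$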
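
That the inverse is obtained by plain matrix inversion at a fixed θ can be checked directly. The sketch below is an illustration with assumed storage conventions: `forward(theta)[nu][m]` holds the coefficient of old component m in new component ν, `inverse(theta)[m][nu]` holds the coefficient of new component ν in old component m, and the angle is an arbitrary hypothetical value.

```python
import math

def forward(theta):
    """New components from old ones: i^dr = c i^a - s i^b, i^qr = s i^a + c i^b."""
    c, s = math.cos(theta), math.sin(theta)
    return [
        [1.0, 0.0, 0.0, 0.0, 0.0],
        [0.0, c,   -s,  0.0, 0.0],
        [0.0, s,   c,   0.0, 0.0],
        [0.0, 0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 0.0, 1.0],
    ]

def inverse(theta):
    """Old components from new ones: i^a = c i^dr + s i^qr, i^b = -s i^dr + c i^qr."""
    c, s = math.cos(theta), math.sin(theta)
    return [
        [1.0, 0.0, 0.0, 0.0, 0.0],
        [0.0, c,   s,   0.0, 0.0],
        [0.0, -s,  c,   0.0, 0.0],
        [0.0, 0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 0.0, 1.0],
    ]

def matmul(A, B):
    """Plain matrix product of two square lists of lists."""
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

theta = 0.7
product = matmul(inverse(theta), forward(theta))   # should be the unit matrix
F_theta, I_theta = forward(theta), inverse(theta)
is_transpose = all(I_theta[r][c] == F_theta[c][r]
                   for r in range(5) for c in range(5))
```

Because the transformation is orthogonal, the inverse here is simply the transpose; no expression of the old coordinates in terms of the new ones is needed.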


IX. The Connection Tensor of Machines

a.) Just as the representative machine with stationary coordinate
axes differs from that with moving coordinate axes only by a
transformation or connection tensor, by the aid of which all polyadics
$L_{mn}$, $\Gamma_{mn,k}$ etc. are transformed into $L_{\alpha\beta}$,
$\Gamma_{\alpha\beta,\gamma}$ etc. according to certain rules developed
later, similarly all other rotating machines differ from the
representative machine with stationary coordinate axes by a
transformation tensor $C^\nu_\alpha$. Each machine has its own
transformation or connection tensor, which in fact is nothing else but a
mathematical representation of its connection diagram.

If the connection diagram of the machine is given, its connection
tensor can be set up in a few minutes, and that found, all its operators
$L_{\alpha\beta}$, $\Gamma_{\alpha\beta,\gamma}$ etc. and hence its
performance can be found by only a routine calculation, which may be
long or short depending on the complexity of the connection diagram, but
without any further analysis or research into the physics of the
problem. In particular:

1.) the steady-state performance of all single and interconnected
machines and the steady hunting performance are found by arithmetical
calculations.

2.) in case of machines with stationary coordinate axes (to which by
far the largest number of machines belong or can be reduced) the
performance calculation for sudden short-circuits with constant speed
maintained, and for transient speed variations superimposed upon a
steady state, is automatically put into a form in which the Expansion
Theorem of Heaviside (which is used mostly in stationary network
analysis) can be immediately applied without the use of operational
transformations.

In the accompanying table the connection diagrams and connection
tensors of a few representative types of machines used in industry are
given. They were discussed in detail in the previous papers, where
important labor-saving methods (complex vectors, X-matrices, etc.)
were also introduced.

It should be noted that:

1.) When two circuits f and g are in series they can be replaced by
h, and $i^f$ becomes $i^h$, also $i^g$ becomes $i^h$.

2.) In machines with sliprings on a rotor circuit like the Schrage
motor (Fig. 8) (or the synchronous converter (Fig. 9)) where sliprings
are connected to the second rotor layer, the moving coordinate axes of
the sliprings $a_{r2}$, $b_{r2}$ can be replaced by the stationary
coordinate axes $d_{r2}$ and $q_{r2}$ in practically all problems.

3.) n represents the ratio of turns.

b.) The power of the new tool can be appreciated especially when
analyzing a group of machines interconnected in any manner whatever
as represented by a tensor $C^\nu_\alpha$. Every operator
$L_{\alpha\beta}$, $\Gamma_{\alpha\beta,\gamma}$ of the group is equal to
the sum of that particular operator of each individual unit, transformed
by $C^\nu_\alpha$. The group connection diagram and the group connection
tensor of a few interconnected machines are given in figs. 15, 16 and 17.




In case of balanced polyphase machines with balanced impressed
voltages only one phase may be studied. To do that all “q” indices
along the stator and the brushes may be replaced by a “d” index and
the corresponding terms multiplied by −j, representing a
fundamental-frequency time lag. Also all slip-ring indices “b” may be
replaced by an “a” index and the corresponding term multiplied by −j,
representing a slip-frequency time lag.

c.) Rotating machines can be grouped according to the form of the matrix
of the connection tensor into three classes:

1.) The matrix of $C^\nu_\alpha$ is square, that is the number of old
coordinates is equal to the number of new coordinates.

2.) The number of new coordinate axes is less and the matrix of
$C^\nu_\alpha$ is rectangular (also called “singular matrix”).

3.) The number of new coordinate axes is more.

In the two latter cases the determinant of the transformation matrix
is zero and the inverse of $C^\nu_\alpha$ cannot be defined without
further assumptions.

Rotating machines also can be grouped into two classes according to the
nature of the variables occurring in the transformation tensor:

1.) Those connection tensors that contain no θ can be represented
as $\partial x^\nu/\partial x^\alpha = C^\nu_\alpha$.

2.) Those that contain θ again represent non-integrable
transformations in passing from the representative machine with
stationary coordinate axes to the particular machine, and $C^\nu_\alpha$
cannot be represented as $\partial f^\nu/\partial x^\alpha = \partial x^\nu/\partial x^\alpha$.

d.) It should also be noted that from fig. 2 the connection tensor of
the salient-pole alternator with amortisseur windings, the asymmetrical
unbalanced induction motor (split-phase, capacitor, etc. motors), the
single-phase induction motor and the multiple-cage induction motor is the
unit matrix (having unity in the main diagonal and zero everywhere else).
The final transient and steady-state formulae of these machines are
exactly identical in using the method of the paper; the difference is only
in the type of terminal voltage (d-c., a-c., balanced, unbalanced, etc.)
that is applied to them.

X. Sinusoidal Space Waves

a.) The physical concepts so far introduced are:

1.) the charges and angular displacement $x^\alpha$

2.) the currents and angular velocity $i^\alpha$

3.) the applied terminal voltages and torque $e_\alpha$

4.) the inductances and moment of inertia $L_{\alpha\beta}$

5.) the resistance and friction $R_{\alpha\beta}$

6.) the flux-linkages and angular momentum $L_{\alpha\beta}\, i^\beta$

Besides these concepts no other physical quantity need be introduced
throughout the rest of the paper. All other expressions that occur in the
equations, like $\Gamma_{\alpha\beta,\gamma}$ or
$L_{\alpha\beta}\, di^\beta/dt$ etc., can be considered simply as
mathematical symbols. This point of view has been followed more or
less consistently by other writers. In this paper, however, it is
intended to interpret physically every symbol and expression that occurs
in the equations, to facilitate the understanding of the meaning of the
symbols, their peculiarities, manner of transformation, etc. In fact
rotating electrical machinery offers a most clear-cut, easily
comprehensible physical picture by which the otherwise abstract concepts
of the Absolute Calculus can be immediately visualized.


It will be assumed for the sake of a physical interpretation of symbols
that at any one instant during acceleration in each layer of winding the
current-density and flux-density waves are sinusoidally distributed in
space.

These assumptions were made in the previous papers in setting up the
fundamental equations, and the equations derived there are identical
with those derived here, though in this paper the only assumption is
the sinusoidal variation of inductances. Hence the assumptions are
legitimate; besides, the actual space variation of quantities in most
machines is not far from sinusoidal, especially in the rotor. In any case
it is only assumed for visualization that the machine with sinusoidal
variation of inductances is replaced by another machine with sinusoidal
space variation of current and flux-densities, both machines giving
identical results.

[b.) The current $i^{d_s}$ is the current in the winding surrounding the
field pole along axis $d_s$. This current, however, becomes $i_{sq}$
when considered as a current-density wave. That is, what here is called
$i^{d_s}$ was called $i_{sq}$ in the previous paper, $i^{q_s}$ was called
$i_{sd}$, etc. However, all results of both points of view are identical
if:

1.) the unit vectors and indices d and q are interchanged, also
a and b

2.) the instantaneous position of the coordinate axes used in the
previous paper is taken as given in Fig. 18, while their position as
used in this paper is given in Fig. 1.]

XI. The Physical Interpretation of Symbols

a.) A two-pole sinusoidal space wave in a winding, like current density,
can be represented by a vector drawn from the center of the rotor
toward the positive maximum value of the wave.

An n-dimensional space wave or vector consists of two-dimensional waves
or vectors in each layer of winding, with an additional dimension along
the axis of the rotor.

b.) In every rotating machine the following n-dimensional space
waves or vectors can be assumed to exist at any one instant during
acceleration:

1.) a current-density and velocity vector $i^m$

2.) a vector $L_{mn}\, i^n$ which includes the flux-linkage vector
(magnetic vector potential) and the angular momentum $L_{tt}\, i^t$

3.) the rotor flux-density vector $\Gamma_{mt,k}\, i^t$ (or
$\Gamma_{(mt),k}\, i^t$). Although the expression stands for a dyadic,
still the dyadic can be represented physically by a space vector. This
calls attention to the fact that two kinds of physical vectors can be
differentiated, “polar” and “axial” vectors. The polar vectors like the
above two are mathematically represented by a vector, while an axial
vector, like the flux
density  is  mathematically  represented  by  a  dyadic  'ki,. 

In addition the following voltage vectors exist:

1.) the impressed voltage vector and shaft torque $e_\beta$

2.) the resistance-drop vector and frictional drop $R_{\alpha\beta}\, i^\alpha$

3.) the vector $L_{\alpha\beta}\, di^\alpha/dt$, which includes the induced voltage vector $d\phi_\beta/dt$ and the time rate of change of angular momentum $L_{tt}\, di^t/dt$

4.) the vector $\Gamma_{\alpha\gamma,\beta}\, i^\alpha i^\gamma$, which includes the generated voltage vector $\Gamma_{\alpha t,\beta}\, i^\alpha i^t$ and the torque developed by the machine $\Gamma_{\alpha\gamma,t}\, i^\alpha i^\gamma$. The two flux-density vectors are not identical. (See section XXVI.)

c. )  In  terms  of  these  space  vectors  the  Equation  of  Voltage  (equ.  17) 
can  be  written  as 


c. 


RmSi*  +  ^Imi' 


34 



if  d4fialdt  is  defined  as  and  the  Equation  of  Torque  (equ.  18)  as 

r,  -  ff„/‘  +  Lu  ^  -I-  . :« 

at 

These equations are the Field Equations of Maxwell generalized for moving bodies, expressed in terms of the magnetic vector potential (assuming no electrostatic field).

d.) Hence in each layer of winding at any instant at most four two-dimensional vectors can be drawn in space, always forming a closed polygon:

1. )  the  impressed  voltage  vector,  if  any, 

2. )  the  resistance  drop  vector, 

3. )  the  induced  voltage  vector, 

4. )  the  generated  voltage  vector,  if  any. 

e. )  [In  a  former  publication  the  writer  developed  a  purely  graphical 
steady-state  analysis  of  rotating  machinery  in  which  the  loci  of  all 
currents,  having  circular,  elliptical  or  any  other  shape,  were  found  with 
the  aid  of  these  four  vectors  only  and  the  current  vector,  which  all  do  exist 
actually  inside  the  machine  in  each  layer  (that  is  they  are  measurable). 
The  advantages  of  this  purely  physical,  graphical  theory  compared  with 
other  graphical  theories  are: 

1. )  the  only  formula  used  in  the  construction  of  loci  is  r/x  which 
only  serves  as  scale  of  measurements 

2. )  beside  these  actually  existing  vectors,  practically  no  other  auxiliary 
construction  lines  are  used 

3.) at any desired point of any locus the magnitude of all currents, fluxes and voltages actually existing in each layer of winding can be immediately constructed without any auxiliary lines, in their correct space and time phase relation to each other, just as they exist inside the machine.

The  results  of  the  graphical  treatment  are  identical  with  those  of 
this  paper. 

It seems that any eventual purely graphical, transient theory of rotating machinery should be built upon the analytically and physically sound principle of representing only actually existing vectors on the diagram instead of hypothetical leakage fluxes, magnetizing currents, etc., or even any other auxiliary construction lines.]

f.) In connection with the space vectors defined above a physical definition can be given to the polyadics of this paper as follows:

1.) a dyadic is an operator which moves a space vector from one part of space into another part by rotations and extensions. For instance $L_{\alpha\beta}$ changes the current vector $i^\alpha$ into a flux-linkage vector $\phi_\beta$. In each layer of winding the position of $\phi_\beta$ is different from that of $i^\alpha$.

2.) a triadic is an operator which changes two vectors into one vector. That is, $\Gamma_{\alpha t,\beta}$ changes the current vector $i^\alpha$ and the velocity vector $i^t$ into a generated voltage vector $\Gamma_{\alpha t,\beta}\, i^\alpha i^t$, or $\Gamma_{\alpha\beta,t}$ changes two current vectors $i^\alpha$ and $i^\beta$ into a torque vector $e_t = \Gamma_{\alpha\beta,t}\, i^\alpha i^\beta$.


THE  THEORY  OF  TENSORS 


XII. Polyadics and Tensors

One of the aims of the Absolute Calculus is to investigate how a particular polyadic behaves when a new coordinate system is introduced.

Let a physical space vector, say a generated voltage vector, be considered, whose components $A^b$ and $A^a$ are known along the moving coordinate axes b and a. If stationary coordinate axes d' and q' are introduced and the same physical vector is constructed again from its components $A^{d'}$ and $A^{q'}$ along the new axes, then often the resultant vector does not coincide with its former position. The vector may even be zero in the new coordinate system, although everything remained the same except the axes from which measurements are made.

If the magnitude and position of a vector are the same in any coordinate system, it is called a "tensor" of rank one. Similarly, if the result of an operation by a polyadic is the same vector even if the polyadic is expressed in various coordinate systems, the polyadic is called a "tensor" (or "invariant").

A tensor is transformed by multiplying each of its indices by the transformation tensor $C^\alpha_{\alpha'}$. For instance the new coefficients $A_{\alpha'\beta'\gamma'\ldots}$ of a tensor are found from the old coefficients $A_{\alpha\beta\gamma\ldots}$ by

$$A_{\alpha'\beta'\gamma'\ldots} = A_{\alpha\beta\gamma\ldots}\, C^\alpha_{\alpha'} C^\beta_{\beta'} C^\gamma_{\gamma'} \ldots \tag{36}$$

A polyadic which is not a tensor is transformed into a new coordinate system by a more complicated formula. A great part of the work below will consist of determining which polyadics are tensors and of finding the formulae of transformation of those that are not tensors.

So far the only tensors established are $dx^\alpha$ and $i^\alpha$, since their formulae of transformation as given in equations 20 and 22 are those of tensors.

If the matrix of transformation is not square but rectangular, all equations developed below and all physical interpretations are also valid for it, except the above physical definition of a tensor. The new polyadics derived with the aid of the rectangular matrices are called "induced" polyadics.

XIII. Covariant Tensors

a.) To establish the character of the vector $e_\alpha$, let the input power, which is a scalar quantity, be $P = e_\alpha i^\alpha$. The value of the power input is an invariant (or tensor), that is, from physical considerations its value is independent of the coordinate system in which it is measured, just like those of the other scalar quantities of output, stored energy, etc.

Hence in new coordinates $P = e_{\alpha'} i^{\alpha'}$. But $i^\alpha = C^\alpha_{\alpha'} i^{\alpha'}$ from equ. 29. Substituting, $P = e_\alpha i^\alpha = e_\alpha C^\alpha_{\alpha'} i^{\alpha'} = e_{\alpha'} i^{\alpha'}$. Hence $e_{\alpha'} = e_\alpha C^\alpha_{\alpha'}$ or

$$e_{\alpha'} = C^\alpha_{\alpha'}\, e_\alpha \tag{37}$$

Comparing this equation with equ. 29 it is found that $C^\alpha_{\alpha'}$, which transforms $e_\alpha$ into the new coordinate system, is the inverse of $C^{\alpha'}_\alpha$, which transforms $i^\alpha$ into the new coordinate system. Hence to differentiate $e_\alpha$ and similar vectors from $i^\alpha$ its index is a lower index. The vectors that transform by equ. 29 have an upper index and are called "contravariant" vectors; those that are transformed according to equ. 37 have a lower index and are called "covariant" vectors. Tensors of higher rank also may have any number of upper and lower indices, but for instance $A^{\alpha}{}_{\beta}$ and $A_{\alpha}{}^{\beta}$ are two different tensors with different $n^2$ coefficients.
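The invariance argument above can be checked numerically. The following is a minimal sketch, not taken from the paper: the matrix `C` and the sample currents and voltages are hypothetical values chosen only for illustration, and `C[a][a_new]` is assumed to play the role of $C^\alpha_{\alpha'}$. Currents are transformed with the inverse of `C` (contravariant rule), voltages with `C` itself (covariant rule, equ. 37), and the power comes out the same in both systems. Transforming the voltage by the contravariant rule instead breaks the invariance.

```python
def mat_vec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]

def transpose(M):
    return [list(col) for col in zip(*M)]

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

# Hypothetical transformation matrix C[old][new]; deliberately not orthogonal.
C = [[2.0, 1.0], [0.5, 3.0]]
i_old = [4.0, -1.5]   # sample currents in the old axes (assumed values)
e_old = [10.0, 6.0]   # sample voltages in the old axes (assumed values)

# Contravariant rule: i_old = C i_new, hence i_new = C^-1 i_old.
i_new = mat_vec(inverse_2x2(C), i_old)
# Covariant rule (equ. 37): e_new[a'] = sum over a of C[a][a'] * e_old[a].
e_new = mat_vec(transpose(C), e_old)

power_old = dot(e_old, i_old)
power_new = dot(e_new, i_new)

# Counter-example: treating the voltage as contravariant changes the power.
e_wrong = mat_vec(inverse_2x2(C), e_old)
power_wrong = dot(e_wrong, i_new)
```

A non-orthogonal `C` is used on purpose: for a pure rotation the inverse and the transpose coincide, and the difference between the two rules would not show.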
b.) Since the kinetic energy $(1/2) L_{\alpha\beta}\, i^\alpha i^\beta$ is a scalar invariant, its derivative with respect to $i^\alpha$, that is $L_{\alpha\beta}\, i^\beta$, is a covariant vector. Its derivative with respect to $i^\beta$, that is $L_{\alpha\beta}$, is a double covariant tensor. Similarly the derivative of the dissipation function, a scalar, $(1/2) R_{\alpha\beta}\, i^\alpha i^\beta$, is $R_{\alpha\beta}\, i^\beta$, a tensor, and also its derivative $R_{\alpha\beta}$ is a double covariant tensor.

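Hence the values of $L_{\alpha'\beta'}$ and $R_{\alpha'\beta'}$ in the new coordinate system with stationary axes are by the formula

$$L_{\alpha'\beta'} = L_{\alpha\beta}\, C^\alpha_{\alpha'} C^\beta_{\beta'} \tag{38}$$

$$L_{\alpha'\beta'} = \begin{array}{c|ccccc} & d_s & d_r & q_r & q_s & t \\ \hline d_s & L_{ds} & M_d & 0 & 0 & 0 \\ d_r & M_d & L_{rd} & 0 & 0 & 0 \\ q_r & 0 & 0 & L_{rq} & M_q & 0 \\ q_s & 0 & 0 & M_q & L_{qs} & 0 \\ t & 0 & 0 & 0 & 0 & L_{tt} \end{array} \tag{39}$$

$$R_{\alpha'\beta'} = \begin{array}{c|ccccc} & d_s & d_r & q_r & q_s & t \\ \hline d_s & r_{ds} & 0 & 0 & 0 & 0 \\ d_r & 0 & r_r & 0 & 0 & 0 \\ q_r & 0 & 0 & r_r & 0 & 0 \\ q_s & 0 & 0 & 0 & r_{qs} & 0 \\ t & 0 & 0 & 0 & 0 & r \end{array} \tag{40}$$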

c.) Among the space vectors so far all but two were found to be tensors. It will be shown now that the remaining two vectors, $L_{\alpha\gamma}\, di^\alpha/dt$, which includes the induced voltage, and $\Gamma_{\alpha\beta,\gamma}\, i^\alpha i^\beta$, which includes the generated voltage, are not tensors, neither are $di^\alpha$, $di^\alpha/dt$ or $\Gamma_{\alpha\beta,\gamma}$, and their law of transformation does not follow the simple formula of equ. 36. In other words the space vector, say $di^\alpha$, has different lengths and different positions when measured from different sets of coordinate axes, although the vector $i^\alpha$ is an invariant.


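XIV. The Transformation Formula of $di^\alpha$

a.) From equ. 29, $i^\alpha = C^\alpha_{\alpha'}\, i^{\alpha'}$. Substituting this value of $i^\alpha$ into $di^\alpha$

$$di^\alpha = d(i^{\alpha'} C^\alpha_{\alpha'}) = di^{\alpha'}\, C^\alpha_{\alpha'} + i^{\alpha'}\, \frac{\partial C^\alpha_{\alpha'}}{\partial x^{\gamma'}}\, dx^{\gamma'} \tag{41}$$

In rotating machinery $C^\alpha_{\alpha'}$ is a function only of $x^t = \theta$. Hence

$$di^\alpha = di^{\alpha'}\, C^\alpha_{\alpha'} + i^{\alpha'}\, \frac{\partial C^\alpha_{\alpha'}}{\partial x^t}\, dx^t \tag{42}$$

The formula shows that the new vector consists of the old vector $di^{\alpha'} C^\alpha_{\alpha'}$ plus an additional vector which is a function of the instantaneous velocity etc. If both old and new coordinate axes are stationary the additional vector disappears and $di^\alpha$ becomes a tensor.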
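Equation 42 can be illustrated with a short numerical sketch. It is not part of the paper: the transformation is assumed here to be a plane rotation through the angle `theta` (the sign convention is an assumption), and the current values are hypothetical. The exact change of the components along the moving axes is compared with the two first-order terms of equ. 42: the transformed increment and the additional vector that depends on `dtheta`.

```python
import math

def rotation(theta):
    """Assumed form of the transformation matrix C: a plane rotation."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def d_rotation(theta):
    """Derivative of the rotation matrix with respect to theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[-s, -c], [c, -s]]

def mat_vec(M, v):
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]

# Hypothetical values: current and its increment along the stationary axes.
i_stat = [3.0, 1.0]
di_stat = [1e-6, -2e-6]
theta, dtheta = 0.4, 1e-6

# Exact change of the components measured along the moving axes.
before = mat_vec(rotation(theta), i_stat)
after = mat_vec(rotation(theta + dtheta),
                [a + b for a, b in zip(i_stat, di_stat)])
exact = [a - b for a, b in zip(after, before)]

# First-order terms of equ. 42.
tensor_part = mat_vec(rotation(theta), di_stat)                      # di' C
extra_part = [dtheta * x for x in mat_vec(d_rotation(theta), i_stat)]  # i' (dC/dtheta) dtheta
predicted = [a + b for a, b in zip(tensor_part, extra_part)]

error = max(abs(x - y) for x, y in zip(exact, predicted))
# The additional vector is perpendicular to the current vector itself.
perpendicularity = sum(x * y for x, y in zip(extra_part, before))
# With the axes at rest (dtheta = 0) the additional vector vanishes.
extra_at_rest = [0.0 * x for x in mat_vec(d_rotation(theta), i_stat)]
```

The residual `error` is of second order in the increments, which corresponds to the remark that the leftover vector on the diagram is an infinitesimal of the second order.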

b.) A physical representation is given in Fig. 19. Let the current vector in the rotor change from OA to OB while the axis of the sliprings changes from $\theta$ to $\theta + d\theta$.

The  observers  on  the  stationary  axes  measure  the  scalar  value  of  the 
current  before  the  change  along  d,  as  OC  and  after  the  change  as  OD. 
Along  axis  q  they  measure  the  change  in  current  as  EF.  Hence  they 
construct  the  di'  vector  as  AB. 

The observers on the moving axes measure the scalar value of the current before the change along axis a as OG and after the change as OI or OL instead of OQ, since the axis itself from which the measurement is made has also moved and the projection of OB on the changed axis is OI. It is true that the observers also measure a change of displacement $d\theta$, but that does not appear to them as a change of current as measured by an ammeter (it will be shown below that it appears to them as an additional voltage). Similarly along axis b the change of current is OH − OK = HK. Hence the resultant $di^\alpha$ as measured from the moving axes is constructed as AM.

From equation 42:

1.) $di^\alpha$ = AM is the vector measured from the moving axes

2.) $di^{\alpha'} C^\alpha_{\alpha'}$ = AB is the vector measured from the stationary axes

3.) $i^{\alpha'}\, dx^t\, (\partial C^\alpha_{\alpha'}/\partial x^t)$ = BX is a vector whose magnitude is equal to $i^{\alpha'} dx^t$ = OA · LI/OL and which is rotated at right angles to $i^{\alpha'}$ by $\partial C^\alpha_{\alpha'}/\partial x^t$.

The vector XM remaining on the diagram, perpendicular to AB, is an infinitesimal of the second order (its magnitude is $di \cdot d\theta$ = AB · LI/OL) and would disappear from the drawing if the angle $d\theta$ and the length AB could be assumed small enough.

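c.) The transformation formula for $di^\alpha/dt$ is

$$L_{\alpha\gamma}\frac{di^\alpha}{dt} = L_{\alpha'\gamma'}\, C^{\alpha'}_\alpha C^{\gamma'}_\gamma\, \frac{di^{\beta'}}{dt}\, C^\alpha_{\beta'} + L_{\alpha'\gamma'}\, C^{\alpha'}_\alpha C^{\gamma'}_\gamma\, i^{\beta'} i^{\delta'}\, \frac{\partial C^\alpha_{\beta'}}{\partial x^{\delta'}}$$

$$= L_{\alpha'\gamma'}\, \frac{di^{\alpha'}}{dt}\, C^{\gamma'}_\gamma + L_{\alpha'\gamma'}\, C^{\alpha'}_\alpha C^{\gamma'}_\gamma\, i^{\beta'} i^{\delta'}\, \frac{\partial C^\alpha_{\beta'}}{\partial x^{\delta'}} \tag{44}$$

since $C^{\alpha'}_\alpha C^\alpha_{\beta'} = \delta^{\alpha'}_{\beta'}$ = unit matrix.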


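XV. The Transformation Formula of $\Gamma_{\alpha\beta,\gamma}$

In the Equation of Motion (equ. 9)

$$e_\gamma = R_{\alpha\gamma}\, i^\alpha + L_{\alpha\gamma}\frac{di^\alpha}{dt} + \Gamma_{\alpha\beta,\gamma}\, i^\alpha i^\beta$$

$e_\gamma$ is a tensor of rank one, similarly $R_{\alpha\gamma}\, i^\alpha$. Hence it follows that the sum of the remaining two vectors also must be a tensor. Since $L_{\alpha\gamma}\, di^\alpha/dt$ is not a tensor, therefore $\Gamma_{\alpha\beta,\gamma}\, i^\alpha i^\beta$ is not a tensor, neither is $\Gamma_{\alpha\beta,\gamma}$.

The formula for the transformation of $\Gamma_{\alpha\beta,\gamma}$ can be found from the fact that the sum of the above mentioned two vectors is a covariant tensor of rank one. Hence according to equ. 37

$$L_{\alpha'\gamma'}\frac{di^{\alpha'}}{dt} + \Gamma_{\alpha'\beta',\gamma'}\, i^{\alpha'} i^{\beta'} = \left( L_{\alpha\gamma}\frac{di^\alpha}{dt} + \Gamma_{\alpha\beta,\gamma}\, i^\alpha i^\beta \right) C^\gamma_{\gamma'} \tag{45}$$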

Substituting  the  value  of  Lmkdi^/dt  from  equ.  44 


This formula, like every other transformation formula, is true for any two coordinate systems and is one of the fundamental formulae of the Absolute Calculus. The first part leaves the operator unchanged, while the second part again is a function of the velocity and it disappears if both sets of coordinate axes are stationary.


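XVI. The Calculation of $\Gamma_{\alpha'\beta',\gamma'}$

a.) The calculation of the generalized Christoffel symbol for the representative machine with stationary coordinate axes is made again in two steps by dividing $\Gamma_{\alpha'\beta',\gamma'}$ into $\Gamma_{\alpha't,\gamma'}$ and $\Gamma_{\alpha'\beta',t}$.

1.) The value of $\Gamma_{\alpha't,\gamma'}$ is found from

$$\Gamma_{\alpha't,\gamma'} = \Gamma_{\alpha t,\gamma}\, C^\alpha_{\alpha'} C^\gamma_{\gamma'} + L_{\alpha\gamma}\, \frac{\partial C^\alpha_{\alpha'}}{\partial x^t}\, C^\gamma_{\gamma'} \tag{47}$$

($C^t_t$ is unity). As an example $\Gamma_{\alpha't,\gamma'}$ will be calculated in detail.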


d.  (i,  ’  qr  q,  '  t 


b.) If $\Gamma_{\alpha't,\gamma'}$ is assumed as an operator in the old coordinate system, the same value of $\Gamma_{\alpha t,\gamma}$ given in equ. 15 is found by the use of equ. 47, that is by

+  . 52 

The recalculation of $\Gamma_{\alpha t,\gamma}$ serves as a check on the correctness of $\Gamma_{\alpha't,\gamma'}$. That is, the sum of these two matrices gives the matrix of equ. 15.

c.) The value of $\Gamma_{\alpha'\beta',t}$ is found from


In section XXVII $\Gamma_{\alpha'\beta',t}$ is not given by this matrix but by another, identical with that of $\Gamma_{\alpha't,\gamma'}$ in equ. 51 with opposite signs. The discrepancy however is only apparent. This matrix is used only in the calculations of torque, $\Gamma_{\alpha'\beta',t}\, i^{\alpha'} i^{\beta'}$. Forgetting for the time being about the index t, the expression for torque is a homogeneous quadratic form. That means that only the symmetrical part of the matrix is effective, that is the matrix given in equ. 56.
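The remark that only the symmetrical part of a matrix contributes to a homogeneous quadratic form can be verified with a few lines of code. This is an illustrative sketch only: the matrix `G` and the vector `x` below are arbitrary assumed numbers, not coefficients from the paper.

```python
def quadratic_form(M, x):
    """Evaluate sum over r, c of M[r][c] * x[r] * x[c]."""
    n = len(x)
    return sum(M[r][c] * x[r] * x[c] for r in range(n) for c in range(n))

def symmetric_part(M):
    n = len(M)
    return [[0.5 * (M[r][c] + M[c][r]) for c in range(n)] for r in range(n)]

def antisymmetric_part(M):
    n = len(M)
    return [[0.5 * (M[r][c] - M[c][r]) for c in range(n)] for r in range(n)]

# Arbitrary non-symmetric matrix and vector (assumed values).
G = [[0.0, 0.8, -0.3],
     [0.2, 0.5, 1.1],
     [0.7, -0.4, 0.0]]
x = [1.5, -2.0, 0.25]

full_value = quadratic_form(G, x)
symmetric_value = quadratic_form(symmetric_part(G), x)
antisymmetric_value = quadratic_form(antisymmetric_part(G), x)
```

Two matrices that differ only by an antisymmetric matrix therefore give the same torque, which is why two different-looking matrices can both be correct.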

d.)  If  the  labor-saving  device  is  not  followed  and  is  defined  as 
in  equ.  8  then  the  coefficients  of  P/,^ ,  would  be  given  in  three  matrices, 
two  for  the  determination  of  voltages  (instead  of  one)  and  one  for  the 
torque,  as  previously.  The  three  components  of  *  are  given  in 
equs.  20-22. 

1.)  Pi,.,  is  found  from  Pi,,.*  equ.  47.  Hence  P«,,*C7Ct  is  one  half 
of  equ.  50  the  other  term  is  given  in  equ.  49.  Their  sum  is 


2.)  Pj,.  ,  is  found  fn)m  Pj,.*  by  equ.  46.  The  matrix  of  Pj,. 
is  one  half  of  equ.  50  the  other  term  is  however  sen)  since  a  is  not  equal 
to  t.  Hence 



The  sum  of  the  two  matrices  57  and  58  is  the  matrix  of  r«i.  *  as 
Kiven  in  equ.  51.  Hence  both  definitions  give  the  same  voltages. 
That  is  F((i),  ”  l'(i«).  »**  "  rf*. 

3.)  r',,  I  is  identical  with  «  given  in  equ.  56. 

It  should  be  noted  that  F'l.  ,  is  not  equal  to  Ft,.  ,  or  in  other  words 
Fm.  w  i*  asymmetrical  in  the  indices  c  and  a  in  either  definition  of  Fm,.  ». 

e.)  With  the  definition  of  section  Vie  matrix  51  would  have  been 
denoted  as  Ft.,  ,  instead  of  F,i. 

XVII. The Coriolis Voltage

a.) An interesting physical interpretation can be given to the transformation formula of $\Gamma_{\alpha\beta,\gamma}$ in equs. 46 or 47 and 55.

Let the Equation of Voltage (equ. 17) be set up for the representative machine with stationary coordinate axes

$$e_{\gamma'} = R_{\alpha'\gamma'}\, i^{\alpha'} + L_{\alpha'\gamma'}\frac{di^{\alpha'}}{dt} + \Gamma_{\alpha't,\gamma'}\, i^{\alpha'} i^t \tag{59}$$

b.) The observers on the stationary axes measure two voltages due to the presence of flux lines:

1.) An induced voltage $L_{\alpha'\gamma'}\, di^{\alpha'}/dt = d\phi_{\gamma'}/dt$ in all stator and rotor axes, assuming:

a.) the rotor conductors stationary,

b.) the currents varying.

2.) A generated voltage $\Gamma_{\alpha't,\gamma'}\, i^{\alpha'} i^t$ in all rotor axes, assuming:

a.) the currents unvarying,

b.) the rotor conductors cutting the resultant rotor flux-density waves with a velocity $i^t$.

c.) In changing over to any machine in which one rotor layer has moving coordinate axes, not the whole length of the previous vector $L_{\alpha'\gamma'}\, di^{\alpha'}/dt$ appears as an induced voltage, but only a part of it, since the moving observers measure a smaller $di^\alpha$ (section XIV). The decrease of the induced voltage vector appears however as an additional generated voltage vector. That is, the moving observers measure three voltages:

1.) An induced voltage in all stator and rotor axes, assuming:

a.) the rotor conductors stationary,

b.) the coordinate axes stationary,

c.) the currents varying.

This voltage is smaller than the corresponding voltage measured by the stationary observers.

2.) A generated voltage in all rotor axes, assuming:

a.) the currents unvarying,

b.) the coordinate axes stationary,

c.) the rotor conductors moving.

It is equal to the corresponding voltage measured by the stationary observers.

3.) A generated voltage in all stator and rotor axes, assuming:

a.) the currents unvarying,

b.) the rotor conductors stationary,

c.) the coordinate axes moving.

Since the currents flowing into the moving coordinate axes are constants, the flux-density wave produced by them also rotates and cuts all stator and rotor windings. Above, the moving conductors are cutting the stationary flux-lines; here the moving flux-lines cut the stationary conductors. This latter voltage is not measured by the stationary observers. This difference in the measured generated voltage is equal to the difference in the measured induced voltage.

It should also be noted that moving coordinate axes are introduced not only by sliprings on a rotor winding, but also by revolving brushes on a stator winding.

In analogy to its more or less dynamical equivalent, the "Coriolis force," this apparently additional generated voltage may be called the "Coriolis voltage." Also the flux density due to the currents in the moving coordinate axes may be called the "Coriolis flux density."

d.) From the Equation of Voltage (equ. 17) of the machine with moving coordinate axes the induced voltage is $L_{\alpha\gamma}\, di^\alpha/dt$, the generated voltage is $\Gamma_{\alpha t,\gamma}\, i^\alpha i^t$. Each of these can be divided by their transformation formulae into two components.

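1.) By equ. 44 the induced voltage is

$$L_{\alpha\gamma}\frac{di^\alpha}{dt} = L_{\alpha'\gamma'}\, \frac{di^{\alpha'}}{dt}\, C^{\gamma'}_\gamma + L_{\alpha'\gamma'}\, C^{\alpha'}_\alpha C^{\gamma'}_\gamma\, \frac{\partial C^\alpha_{\beta'}}{\partial x^t}\, i^{\beta'} i^t \tag{60}$$

The first term on the right hand side is the induced voltage measured by the stationary observers, but expressed along the moving coordinates by $C^{\gamma'}_\gamma$. The second term is the decrease of the induced voltage.

2.) By equ. 52 the total generated voltage is

$$\Gamma_{\alpha t,\gamma}\, i^\alpha i^t = \Gamma_{\alpha't,\gamma'}\, C^{\alpha'}_\alpha C^{\gamma'}_\gamma\, i^\alpha i^t + L_{\alpha'\gamma'}\, \frac{\partial C^{\alpha'}_\alpha}{\partial x^t}\, C^{\gamma'}_\gamma\, i^\alpha i^t \tag{61}$$

The first term on the right hand side is the generated voltage vector measured by stationary observers but expressed along the moving coordinates by $C^{\gamma'}_\gamma$. The second term is the Coriolis voltage, the increase in the generated voltage.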
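e.) The decrease in the induced voltage is equal to the increase in the generated voltage, that is

$$L_{\alpha'\gamma'}\, C^{\alpha'}_\alpha C^{\gamma'}_\gamma\, \frac{\partial C^\alpha_{\beta'}}{\partial x^t}\, i^{\beta'} i^t + L_{\alpha'\gamma'}\, C^{\gamma'}_\gamma\, \frac{\partial C^{\alpha'}_\alpha}{\partial x^t}\, i^\alpha i^t$$

$$= L_{\alpha'\gamma'}\, C^{\gamma'}_\gamma \left( C^{\alpha'}_\alpha \frac{\partial C^\alpha_{\beta'}}{\partial x^t} + \frac{\partial C^{\alpha'}_\alpha}{\partial x^t}\, C^\alpha_{\beta'} \right) i^{\beta'} i^t = L_{\alpha'\gamma'}\, C^{\gamma'}_\gamma\, \frac{\partial (C^{\alpha'}_\alpha C^\alpha_{\beta'})}{\partial x^t}\, i^{\beta'} i^t = 0$$

since $C^{\alpha'}_\alpha C^\alpha_{\beta'}$ is the idemfactor and its derivative is zero. Hence from equ. 56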


Coriolis  voltage  =  ?*  =  L„Cl  i"i‘  »  ♦*,1* . 62 

OT 
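The cancellation in paragraph e.) rests on one fact: the product of a transformation matrix and its inverse is the unit matrix, so the derivative of that product vanishes. A minimal numerical sketch follows, assuming (as an illustration only) that the transformation is a plane rotation through `theta`; the angle is an arbitrary value.

```python
import math

def rotation(theta):
    """Assumed transformation matrix: a plane rotation."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def d_rotation(theta):
    """Derivative of the rotation matrix with respect to theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[-s, -c], [c, -s]]

def transpose(M):
    return [list(col) for col in zip(*M)]

def mat_mul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]

theta = 0.7                     # arbitrary angle (assumed value)
C = rotation(theta)
dC = d_rotation(theta)
C_inv = transpose(C)            # the inverse of a rotation is its transpose
dC_inv = transpose(dC)          # derivative of the inverse

# d(C_inv C)/dtheta = C_inv dC + dC_inv C, which must be the zero matrix.
first = mat_mul(C_inv, dC)
second = mat_mul(dC_inv, C)
residual = [[first[r][c] + second[r][c] for c in range(2)] for r in range(2)]
max_residual = max(abs(v) for row in residual for v in row)
```

Neither of the two terms is zero on its own; they are equal and opposite, which mirrors the statement that the decrease of the induced voltage reappears as the Coriolis voltage.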


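f.) The sum of equ. 60 and equ. 61 gives

$$L_{\alpha\gamma}\frac{di^\alpha}{dt} + \Gamma_{\alpha t,\gamma}\, i^\alpha i^t = \left( L_{\alpha'\gamma'}\frac{di^{\alpha'}}{dt} + \Gamma_{\alpha't,\gamma'}\, i^{\alpha'} i^t \right) C^{\gamma'}_\gamma \tag{63}$$

which is the reverse transformation of equ. 45.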


CONSTANT  SPEED 

XVIII.  The  Transient  Impedance  Matrix 

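a.) In by far the largest number of problems the speed of the rotor is assumed to be known. In that case the Equation of Voltage is simplified to

$$e_\beta = R_{\alpha\beta}\, i^\alpha + L_{\alpha\beta}\frac{di^\alpha}{dt} + \Gamma_{\alpha t,\beta}\, i^\alpha i^t = (R_{\alpha\beta} + L_{\alpha\beta}\, p + \Gamma_{\alpha t,\beta}\, i^t)\, i^\alpha$$

where $p = d/dt$. The expression in parentheses is an important dyadic, is denoted by $Z_{\alpha\beta}$ and called the "transient impedance matrix," representing the opposition of a machine to suddenly applied terminal voltages while its speed is maintained constant, that is

$$Z_{\alpha\beta} = R_{\alpha\beta} + L_{\alpha\beta}\, p + \Gamma_{\alpha t,\beta}\, i^t \tag{64}$$

Hence for steady speed the Equation of Voltage is

$$e_\beta = Z_{\alpha\beta}\, i^\alpha \tag{65}$$

from which the current is found by calculating the inverse of $Z_{\alpha\beta}$

$$i^\alpha = Y^{\alpha\beta}\, e_\beta \tag{66}$$

$Y^{\alpha\beta}$ may be called the "admittance matrix." (See section XXXId.)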

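b.) For the representative machine with stationary coordinate axes the impedance matrix is

$$Z_{\alpha\beta} = \begin{array}{c|cccc} & d_s & d_r & q_r & q_s \\ \hline d_s & r_{ds} + L_{ds}\, p & M_d\, p & -M_d\, p\theta & 0 \\ d_r & M_d\, p & r_r + L_{rd}\, p & -L_{rd}\, p\theta & 0 \\ q_r & 0 & L_{rq}\, p\theta & r_r + L_{rq}\, p & M_q\, p \\ q_s & 0 & M_q\, p\theta & M_q\, p & r_{qs} + L_{qs}\, p \end{array} \tag{67}$$

It should be noted that $Z_{\alpha\beta}$ is not a symmetrical matrix. For the representative machine with two layers of stator and rotor windings to be used in general work, the transient impedance matrix is given in Table I.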

c.) For any machine whose connection tensor is given by $C^\alpha_{\alpha'}$ its transient impedance matrix is found from that of the above representative machine by


Z., 


.68 


as follows from equs. 64 and 46. If no slip-rings exist the last term disappears and $Z_{\alpha'\beta'}$ is a tensor. The torque is by equ. 19


Torque  « 


.69 


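where $\Gamma_{\alpha'\beta',t}$ is found from $\Gamma_{\alpha\beta,t}$ given in Table I by the transformation formula (equ. 55)

$$\Gamma_{\alpha'\beta',t} = \Gamma_{\alpha\beta,t}\, C^\alpha_{\alpha'} C^\beta_{\beta'} \tag{70}$$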


When the impressed voltage is a-c then $p = d/dt$ is replaced by $j\omega$, $pL$ becomes $jX$ and the impedance matrix may be called the "steady-state impedance matrix." With an applied d-c e.m.f. $p = 0$. Also $p\theta$ becomes $v\omega$ where $v$ is the ratio of the actual speed to the synchronous speed.


d.) When any number and any types of rotating machines are connected in any manner whatever, the transient impedance of the group is equal to the sum of the transient impedances of the individual units, transformed by the connection tensor of the group $C^\alpha_{\alpha'}$. That is

71 

e. )  Even  when  only  the  steady-state  performance  is  desired  it  is 
always  a  safe  procedure  to  set  up  first  the  transient  impedance  matrix. 

The  transient  impedance  matrix  gives  the  instantaneous  voltage 
appearing  at  each  terminal  due  to  a  suddenly  applied  unit  current  at  any 
one  of  the  terminals  {with  the  speed  maintained  constant),  while  the  transient 
admittance  matrix  gives  the  instantaneous  current  flowing  through  each 
terminal  due  to  a  suddenly  applied  unit  voltage  at  any  one  of  the  terminals. 


XIX.  The  Repulsion  Motor 

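a.) As an example let the performance of the repulsion motor be calculated whose connection diagram and connection tensor are given in fig. 6. Its transient impedance tensor is

$$Z_{\alpha\beta} = \begin{array}{c|cc} & d_s & a \\ \hline d_s & r_{ds} + L_{ds}\, p & M_d (\cos\alpha\, p - \sin\alpha\, p\theta) \\ a & M_d \cos\alpha\, p & r_r + (L_{rd}\cos^2\alpha + L_{rq}\sin^2\alpha)\, p - (L_{rd} - L_{rq}) \sin\alpha \cos\alpha\, p\theta \end{array}$$

If the airgap is uniform, $L_{rd} = L_{rq} = L_r$, the impedance tensor simplifies to

$$Z_{\alpha\beta} = \begin{array}{c|cc} & d_s & a \\ \hline d_s & r_{ds} + L_{ds}\, p & M_d (\cos\alpha\, p - \sin\alpha\, p\theta) \\ a & M_d \cos\alpha\, p & r_r + L_r\, p \end{array}$$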
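The repulsion-motor impedance can be reproduced from the impedance matrix of the representative machine (equ. 67) by the tensor rule $Z_{\alpha'\beta'} = Z_{\alpha\beta} C^\alpha_{\alpha'} C^\beta_{\beta'}$. The sketch below makes several assumptions that are not stated in the text at this point: the connection tensor of fig. 6 is taken to be $i^{d_r} = \cos\alpha\, i^a$, $i^{q_r} = \sin\alpha\, i^a$ with no $q_s$ winding; the operators $p$ and $p\theta$ are replaced by ordinary numbers so that the algebra can be checked; and all machine constants are arbitrary. The rows of each matrix are taken as the current index and the columns as the voltage index.

```python
import math

def primitive_impedance(p, p_theta, r_ds, r_r, r_qs,
                        L_ds, L_rd, L_rq, L_qs, M_d, M_q):
    """Matrix of equ. 67; axes ordered (ds, dr, qr, qs)."""
    return [
        [r_ds + L_ds * p, M_d * p, -M_d * p_theta, 0],
        [M_d * p, r_r + L_rd * p, -L_rd * p_theta, 0],
        [0, L_rq * p_theta, r_r + L_rq * p, M_q * p],
        [0, M_q * p_theta, M_q * p, r_qs + L_qs * p],
    ]

def transform(Z, C):
    """Z_new[a'][b'] = sum over a, b of Z[a][b] * C[a][a'] * C[b][b']."""
    n_old, n_new = len(C), len(C[0])
    return [[sum(Z[a][b] * C[a][ap] * C[b][bp]
                 for a in range(n_old) for b in range(n_old))
             for bp in range(n_new)] for ap in range(n_new)]

# Arbitrary numbers standing in for the operators and constants (assumed).
alpha = 0.5
p, p_theta = complex(0.3, 2.0), 1.7
k = dict(r_ds=0.5, r_r=0.4, r_qs=0.6, L_ds=1.2, L_rd=1.1,
         L_rq=0.9, L_qs=1.0, M_d=0.95, M_q=0.8)

ca, sa = math.cos(alpha), math.sin(alpha)
# Assumed connection tensor: old axes (ds, dr, qr, qs) by new axes (ds, a).
C = [[1.0, 0.0], [0.0, ca], [0.0, sa], [0.0, 0.0]]

Z_rep = transform(primitive_impedance(p, p_theta, **k), C)

# Closed-form entries of the repulsion-motor impedance tensor.
expected = [
    [k["r_ds"] + k["L_ds"] * p, k["M_d"] * (ca * p - sa * p_theta)],
    [k["M_d"] * ca * p,
     k["r_r"] + (k["L_rd"] * ca**2 + k["L_rq"] * sa**2) * p
     - (k["L_rd"] - k["L_rq"]) * sa * ca * p_theta],
]
max_error = max(abs(Z_rep[r][c] - expected[r][c])
                for r in range(2) for c in range(2))
```

Under these assumptions the transformed matrix agrees with the closed-form entries to rounding error, including the asymmetry between the two off-diagonal terms.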

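If the determinant is denoted by D, the transient admittance tensor is

$$Y^{\alpha\beta} = \begin{array}{c|cc} & d_s & a \\ \hline d_s & (r_r + L_r\, p)/D & -M_d (\cos\alpha\, p - \sin\alpha\, p\theta)/D \\ a & -M_d \cos\alpha\, p/D & (r_{ds} + L_{ds}\, p)/D \end{array}$$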

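b.) With a suddenly applied terminal voltage $e_{ds} = e\mathbf{1}$, where $\mathbf{1}$ is the Heaviside unit function, $i^\alpha = Y^{\alpha\beta} e_\beta$ and $i^{ds} = Y^{ds\,ds} e_{ds}$, or if $p\theta = v\omega$

$$i^{ds} = \frac{e\, (r_r + L_r\, p)\, \mathbf{1}}{r_r r_{ds} + p\,(r_r L_{ds} + r_{ds} L_r + M_d^2 \cos\alpha \sin\alpha\, v\omega) + p^2 (L_{ds} L_r - M_d^2 \cos^2\alpha)}$$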

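c.) With an a-c terminal voltage $e_{ds} = e \sin\omega t = e$ the steady-state admittance tensor is

$$Y^{\alpha\beta} = \begin{array}{c|cc} & d_s & a \\ \hline d_s & (r_r + jX_r)/D & (-jX_m \cos\alpha + X_m \sin\alpha\, v)/D \\ a & -jX_m \cos\alpha/D & (r_{ds} + jX_{ds})/D \end{array}$$

where

$$D = (r_r r_{ds} + X_m^2 \cos^2\alpha - X_{ds} X_r) + j\,(r_r X_{ds} + r_{ds} X_r + X_m^2 \sin\alpha \cos\alpha\, v)$$

hence

$$i^{ds} = e\,(r_r + jX_r)/D \quad \text{and} \quad i^a = -e\, jX_m (\cos\alpha + jv \sin\alpha)/D.$$

d.) The value of $\Gamma_{\alpha\beta,t}$ is

$$\Gamma_{\alpha\beta,t} = \begin{array}{c|cc} & d_s & a \\ \hline d_s & 0 & M_d \sin\alpha/2 \\ a & M_d \sin\alpha/2 & (L_{rd} - L_{rq}) \sin\alpha \cos\alpha/2 \end{array}$$

For a machine with smooth airgap the torque is $\Gamma_{ds\,a,t}\, i^{ds} i^a + \Gamma_{a\,ds,t}\, i^a i^{ds} = 2\Gamma_{ds\,a,t}\, i^{ds} i^a$. Since the steady-state torque is to be expressed in synchronous watts the expression should be multiplied by $\omega$. Hence

$$\text{Torque} = (X_m \sin\alpha) \cdot [e\,(r_r + jX_r)/D] \cdot [e X_m (v \sin\alpha - j \cos\alpha)/D].$$

Each expression is a time vector due to the presence of j, hence the products represent scalar products. Since the scalar product $(a + jb) \cdot (c + jd) = ac + bd$

$$\text{Torque} = \frac{e^2 X_m^2 \sin\alpha\, (r_r v \sin\alpha - X_r \cos\alpha)}{(r_r r_{ds} + X_m^2 \cos^2\alpha - X_{ds} X_r)^2 + (r_r X_{ds} + r_{ds} X_r + X_m^2 \sin\alpha \cos\alpha\, v)^2}$$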
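The steady-state result for the repulsion motor can be checked with complex arithmetic. The sketch below is an illustration under stated assumptions: the machine constants are arbitrary numbers, the rotor winding is short-circuited (zero voltage on axis a), the airgap is smooth, and the "scalar product" of two time vectors is taken as $ac + bd$, as in the text. It solves the two voltage equations for the currents, forms the torque from the phasor product, and compares it with the closed-form expression.

```python
import math

# Arbitrary machine constants and operating point (assumed values).
e = 100.0
r_ds, r_r = 0.5, 0.4
X_ds, X_r, X_m = 10.0, 9.5, 9.0
alpha, v = 0.6, 0.8

ca, sa = math.cos(alpha), math.sin(alpha)

# Steady-state impedances: p -> j*omega, p*theta -> v*omega.
# First index is the current axis, second index the voltage axis.
z_ds_ds = complex(r_ds, X_ds)
z_ds_a = X_m * complex(-v * sa, ca)   # X_m (j cos(alpha) - v sin(alpha))
z_a_ds = complex(0.0, X_m * ca)       # j X_m cos(alpha)
z_a_a = complex(r_r, X_r)

D = z_ds_ds * z_a_a - z_ds_a * z_a_ds
i_ds = e * z_a_a / D
i_a = -e * z_ds_a / D

# Check that the currents satisfy both voltage equations.
residual_ds = z_ds_ds * i_ds + z_a_ds * i_a - e   # stator axis: impressed e
residual_a = z_ds_a * i_ds + z_a_a * i_a          # rotor axis: short circuit

# Torque in synchronous watts from the scalar product of the time vectors.
torque_phasor = X_m * sa * (i_ds.real * i_a.real + i_ds.imag * i_a.imag)

# Closed-form expression.
numerator = e**2 * X_m**2 * sa * (r_r * v * sa - X_r * ca)
denominator = ((r_r * r_ds + X_m**2 * ca**2 - X_ds * X_r)**2
               + (r_r * X_ds + r_ds * X_r + X_m**2 * sa * ca * v)**2)
torque_closed = numerator / denominator
```

The denominator of the closed form is the squared magnitude of `D`, which the check below confirms separately.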


XX.  Two  Salient-pole  Synchronous  Machines 

a.)  The  connection  diagram  and  connection  tensor  of  the  group  is 
given  in  fig.  17.  The  transient  impedance  tensor  of  the  group  is  found 
from  equ.  71.  The  constants  of  the  second  machine  are  primed. 
Also  6/f  —  41  —  8.  Hence  Z^a  is  (since  i*  —  —  i*'). 


b.) Since the applied steady-state voltages along the direct and quadrature axes of the armature are constants, $e_d = e \sin\delta$ and $e_q = e \cos\delta$, the steady-state impedance tensor of the group is found by making $p = 0$. Hence (since $p\theta = p\theta' = \omega$ and $\omega L = X$)


d: 

d, 

dr 

Z., - 

Qr 

Q. 

If  the  resistances  are  assumed  to  be  zero  the  equations  check  with 
those  given  by  Doherty  and  Nickle. 

XXI,  Scattered  Examples 

1.)  Fynn-Weichsel  Motor.  Fig.  14.  Its  transient  impedance  ten¬ 
sor  is 


d.  d,  dr  Qr  q, 


r'.s 

0 

-  sin  6 

X' j  cos  a 

0 

0 

0 

rw 

0 

-  X.., 

0 

0 

0 

0 

r,  -f-  —  sin  6. 

a)s6(X;,  -  X'r,) 

—  Xrs  —  X’r4  COS*  a 

—  X ' ,  sin*  a 

0 

0 

0 

0 

X rt  XJ^sin*5  -)- 
X',  cos*  6 

r,  -1-  r'  -f  sin  a. 
cosa(x;,  -  x:,) 

0 

0 

0 

0 

X., 

0 

’'.f 

0 

0 

0 

-  x«,  cos  a 

-  X' ,  sin  a 

0 

f 

r.„ 


For  the  calculation  of  torque 

_ 


—  Mn  sin  a 

0  i 

-  Mn 

Mn  cos  a 

Mn 

0 

a.) For its steady-state performance as an induction motor $p = j\omega(1 - v)$ and $p\theta = v\omega$.

b.) For its performance as a synchronous motor $p = 0$ and $p\theta = \omega$.

2.)  Schrage  Motor.  Fig.  8.  Its  connection  tensor  as  a  polyphase 

motor  is 


d'r 

dl 

fll 

<7. 

n 

**»  -  ,*• 

0 

0 

-  it  (€*«  -  «*•) 

-  itn 

0 

0 

—  jt’* 

0 

0 

If  -  a)/2  -  y  and  (o  -|-  0)/2  «=  6 


/  a 


(r,  -b  jgX,)  n*  -b  2X^  sin  y  —  gf^)  n 

-b  4  (rj  -b  jX‘r)  sin*  y 

jX^n  -  2A'  J  sin  yt^ 

jgX^n  -b  2X 1  sin  •yc"'* 

r'r  +  jXl 

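3.) Polyphase Induction Motor with slip-rings. Fig. 3. Its transient impedance tensor is

$$Z_{\alpha\beta} = \begin{array}{c|cc} & d_s & a \\ \hline d_s & r_s + L_s\, p & Mp\,(\cos\theta - j \sin\theta) \\ a & Mp\,(\cos\theta + j \sin\theta) & r_r + L_r\, p \end{array}$$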

4.) Frequency-Converter. Fig. 12. With fundamental-frequency voltage on the slip-rings and slip-frequency voltage on the brushes its transient impedance tensor is


a  d. 


r,  +  l-rV 

(r,  -b  L,p)  (cos  9  —  j  sin  9) 

(r,  -b  L,p)  (cos  -b  A:  sin  9) 

r,  -b  LrP  ~  j  I'rP^ 

5.) Induction Motor and Frequency-Converter connected to be used as Phase-Advancer. Fig. 15. Its transient impedance tensor is found simply by adding the two previous impedance tensors and multiplying their sum with the connection tensor of the group according to equ. 71. The steady-state impedance tensor of the group is

d,  f  a 


r, 

0  1 

A- 

( r,  +  jsXf)  =1-  (r^  +  jX ' ) 

(r:  -VjX'r)  €>•  1 

0 

(r'  -f  jA';)  t" 

r'  +  jX' 

NON-HOLONOMIC DYNAMICAL SYSTEMS

XXII.  The  Equation  of  Motion  of  Holonomic  Systems 

a.) Comparing the definition of $\Gamma_{\alpha\beta,\gamma}$ as given in equations 12, 13 and 14 for the representative machine with moving coordinate axes and its value given in equ. 51 and 56 for the representative machine with stationary coordinate axes, it can be seen that $\Gamma_{\alpha't,\gamma'}$ is not equal to $\partial L_{\alpha'\gamma'}/\partial x^t$, neither is $\Gamma_{\alpha'\beta',t}$ equal to $-\tfrac{1}{2}\, \partial L_{\alpha'\beta'}/\partial x^t$, since every coefficient of $L_{\alpha'\beta'}$ is a constant (see equ. 39). That is, the following important fact should be noted:

The generalized Christoffel symbol is defined in terms of the metric tensor $L_{\alpha\beta}$ by equ. 11 only for the representative machine with moving coordinate axes. For every other coordinate system its value is found by a transformation of coordinates from equ. 46, since as yet $\Gamma_{\alpha\beta,\gamma}$ has not been defined for them in terms of the metric tensor.

This is the reason why the qualifying term "generalized" is added to differentiate it from the ordinary Christoffel symbol of Riemannian Geometry, which is defined by an equation identical with equ. 8 for every coordinate system and which of course also obeys the rule of transformation of equ. 46.

This distinction in the definition of $\Gamma_{\alpha\beta,\gamma}$ in terms of $L_{\alpha\beta}$ for the two coordinate systems is due to the fact that among the innumerable types of rotating electrical machines there is one and one only whose coordinates are true Lagrangian coordinates. This one is the representative machine with moving coordinate axes having any number of layers of windings on the stator and rotor, since in order that an axis should be a true Lagrangian coordinate axis it must be connected to the moving conductors. To this type belong only the synchronous machine and the slipring induction motor, provided the sliprings are taken as the coordinate axes. It is only for these axes that Maxwell's form of the Equation of Voltage (derived from equs. 5 or 17)


is valid. This equation is not valid for any other axes since it gives
only the induced and Coriolis voltages, but it does not give the generated
voltage due to the motion of rotor conductors. Any attempt at routine
substitution fails, which can immediately be seen in the case of machines
with stationary coordinate axes, where $dL_{\alpha\beta}/dt = i^t\,\partial L_{\alpha\beta}/\partial x^t = 0$ as
mentioned at the beginning of this section.

This fact has not been recognized before and this is the reason why
numerous treatises and papers on rotating machinery start with
Hamilton’s, Lagrange’s or Maxwell’s equation of the usual form, but
use them only for the alternator with moving coordinate axes and for
no other machine. They all are compelled to analyze the complicated
physical phenomena inside of each type of machine, and even in the
alternator if its moving axes are changed to stationary axes along the
fieldpole and the interpolar space, since the routine substitution, which
after all is the whole purpose of all generalized equations, fails.

b.) In the following sections $\Gamma_{\alpha\beta,\gamma}$ will be defined in terms of the
metric tensor for all coordinate systems. The three matrices of $\Gamma_{t\beta,\gamma}$,
$\Gamma_{\alpha t,\gamma}$ and $\Gamma_{\alpha\beta,t}$ will be found to be defined slightly differently from the
matrices defined previously, but again the voltages and torques calculated
from them are identical with those found by previous definitions
of $\Gamma_{\alpha\beta,\gamma}$.

Also  in  the  next  three  sections  all  rotating  machines  will  be  con¬ 
sidered  derived  from  the  representative  machine  with  moving  coordinate 
axes,  that  is  their  connection  tensor  CZ  will  be  the  product  of  the  con¬ 
nection  tensor  Cl  given  in  figs.  2-17  with  that  of  the  representative 
machine  with  stationary  coordinate  axes  CZ.  In  other  words  C"  will 
be  equal  to  C'C7. 


XXIII. The Equation of Motion of Non-holonomic Systems

a.)  Let  the  Equation  of  Motion  for  true  coordinates  be  used  in  the 
form  of  equ.  6  that  is,  let 


I  ( f  in 

“2  IFr 


.73 


Instead of defining the term in parenthesis as $\Gamma_{mn,k}$ and calculating
its transformation formula (as has been done in section XV), let the
whole equation be transformed term by term to a new non-holonomic
system with axes α, β, γ, ... by replacing $a_{mn}$ by $a_{\alpha\beta}C^\alpha_m C^\beta_n$, $\dot{x}^m$ by $\dot{x}^\alpha C^m_\alpha$
etc., where $C^m_\alpha$ is not equal to $\partial x^m/\partial x^\alpha$


NON.RIEMANNIAN  DYNAMICS 


155 


f.c: .  +  a^czci^c:  + 

+  ^c:c:i-c:i’c;  +  +  o„  o-crc; 


-  ~  c:cii«c:i>c;  - 


1  aC* 

-^^a^^czxKr.xyc; . 74 


But 


and 


o„c:x»x>(^-c;  +  ^c:)  -  .  o 

- 1  o^-c-i’c;  c;  +  ^*  c;)  .  -  o^-o>c;  ^  c; 


since  a  and  S  also  m  and  n  can  be  interchanged  in  this  expression. 
Multiplying  every  term  by  C*  the  result  is 


/ aa.,  _  1  da,y'^ 

\  ax>  2  ax* ; 

+  a.*c;ct 

1 

K\h 
- ' 

This is the Equation of Motion for a non-holonomic dynamical system
and this is also the form of equation that should be used for the study
of various types of rotating machines. The correctness of the formula
can be checked by replacing $a_{\alpha\beta}\dot{x}^\beta$ by $\partial T/\partial\dot{x}^\alpha$, where $T$ is the kinetic
energy of the system in terms of the new differentials, that is $T = \tfrac{1}{2}a_{\alpha\beta}\dot{x}^\alpha\dot{x}^\beta$.
Hence


d  (dT\  _dT  dT/dCi  _  dct\  y 

dt  Vdr'/  di*  ai*  \  di"  dx*  /  ^  ’ 


aP 

ax* 


/. 


....76 


This is the modified Equation of Motion of Lagrange valid for
non-holonomic dynamical systems as given by Whittaker in his “Analytical
Dynamics,” with the notation slightly changed and $\partial F/\partial\dot{x}^\alpha$ added.

It should be noted that in equ. 75 all $a_{\alpha\beta}$ and $C^m_\alpha$ are functions of the
old coordinates, and in the n dynamical equations there are 2n variables.
They are the n old coordinates and the n new differentials or velocities.
The other n equations are $dx^m = C^m_\alpha\,dx^\alpha$.


.75 




b.) If instead of transforming equ. 73 term by term the expression
in parenthesis is defined as $\Gamma_{mn,k}$ and transformed just as in section XV,
the result is


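$$f_\alpha = R_{\alpha\beta}\dot{x}^\beta + a_{\alpha\beta}\frac{d\dot{x}^\beta}{dt} + \left(\Gamma_{mn,k}\,C^m_\beta C^n_\gamma C^k_\alpha + a_{mk}\,\frac{\partial C^m_\beta}{\partial x^\gamma}\,C^k_\alpha\right)\dot{x}^\beta\dot{x}^\gamma$$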

Even though in the general case of non-holonomic systems the expression
in parenthesis is not a function of the n new coordinates but of
the n old coordinates and the n new differentials, still a geometry can be
defined, the so-called “non-holonomic” geometry, in which the expression
in the parenthesis plays the rôle of the “coefficients of connection.”
Schouten defines


so that the Equation of Motion of non-holonomic dynamical systems
becomes

78 


^  -I-  A„.  rX'i*  -I-  R„x‘  *  /, 
at 


XXIV. The Equation of Motion of Quasi-holonomic Systems


a.) Due to the special character of the coordinates of electrical machinery,
equs. 75 and 78 assume special forms. All $L_{\alpha\beta}$ and $C^m_\alpha$ are functions of
one variable $x^t$ which is both an old and a new coordinate, and so in the
n dynamical equations there are only n unknowns, the n new coordinates.
Hence equs. 75 and 78 must be identical with the Equation of Motion
as derived in equ. 19 for any coordinate system, that is they must be
identical with


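$$e_\alpha = R_{\alpha\beta}\,i^\beta + L_{\alpha\beta}\frac{di^\beta}{dt} + \Gamma_{\beta\gamma,\alpha}\,i^\beta i^\gamma \tag{79}$$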


Hence the generalized Christoffel symbol must be equal in any coordinate
system to


f  dLm*  1  bLay\  f  sn 


This is the equation by which $\Gamma_{\beta\gamma,\alpha}$ is expressed in terms of the metric
tensor of the various machines, and not equ. 11. This equation is a




generalized form of equ. 11 and is valid for any coordinate system.
In the special case of true coordinates the coordinates k and n can be
interchanged, the last term is zero and equ. 11 is left.

b.) The first expression in parenthesis on the right hand side is the
ordinary Christoffel symbol. Its formula of transformation is the same
as that of $\Gamma_{\alpha\beta,\gamma}$, that is equ. 46.

$L_{\alpha\beta}\,di^\beta/dt$ and the Christoffel symbol together form a tensor (see
section XV), and since the first three members of the right hand term
of equ. 75 are tensors, the last member also must be a tensor. That is,
the coefficient of $i^\beta i^\gamma$ is a tensor of rank three (the “torsion” tensor).


81 


It is important to note that the torsion tensor is skew-symmetric in γ and α, that is


-  n.. 


.82 


Hence  the  generalized  Christoffel  symbol  can  be  expressed  as  the  sum 
of  the  ordinary  Christoffel  symbol  and  a  tensor  of  rank  three, 


»  l7a,gl  Ty, 


.83 


c.) The Equation of Motion of rotating electrical machinery can be
written in terms of the new symbols as


f,  =*  K.,i"  +  +  lya.ffli’i*  +  Ty„i^i" 


84 


PHYSICAL  INTERPRETATIONS 


XXV. The Equation of Voltage


a.) The Equation of Voltage can be set up by making α assume all
indices except t. Because in $C^k_\alpha$ the index k also must assume any value
but t, the negative terms in equ. 75 disappear and the Equation of
Voltage is


-  «...■•  +  i,„  f  +  (^  +  .V . 85 

b.)  The  value  of  Ty„  is  if  7  —  / 

a/’* 


86 




11 

1^51 


and  if  a  >  I  the  value  of  Ty„  is 


These equations are valid for any machine with stationary or moving
coordinate axes since the torsion $T$ is a tensor.

c.) The value of the Christoffel symbol is zero for machines with stationary
coordinate axes since all coefficients of $L_{\alpha\beta}$ are constants. Its
value for machines with moving coordinate axes is:

1.)  If  7  »  I  then  from  equ.  85 

-  % . 


2.)  If  a  ^  t  then  from  equ.  85 

\yt,  o]  »  0 . 

Hence  in  terms  of  this  symbol  the  Equation  of  Voltage  is 


e,  »  /?■#*•  +  L„  -f  Tim,i*i*. 

Comparing  it  with  equ.  10 

Ti., ,  -  [ta,  ff)  -I-  T,„ . 


d.) As an example let the value of $\Gamma_{t\alpha,\beta}$ be calculated for the representative
machine with stationary coordinate axes from equ. 91. Since
the coordinate axes are stationary, $[t\alpha,\beta]$ is zero. $T_{t\alpha\beta}$ is found from
equ. 86

1. )  L«i  is  given  in  equ.  39. 

2. )  Ci  is  given  in  equ.  28. 

3. ) 


dr  * 

act  a 

—  sin  $ 

COS  8 

dxf  h 

—  008  6 

—  sin  0 

in  equ.  33. 

dr 

*  Qr 

o 

1 

Qr  -  1 

0 



6.) Multiplying it by equ. 39 (that is, changing in equ. 39 the column
$d_r$ to $-q_r$, the column $q_r$ to $d_r$, and leaving out all other columns) the
result is identical with equ. 51. That is, with the new definition $\Gamma_{t\alpha,\beta}$
has the same matrix that previously $\Gamma_{(t\alpha),\beta}$ had.

XXVI.  Summary  of  Voltage  Vectors 

Summarizing the voltages due to the presence of flux lines (defining
$\Gamma_{t\alpha,\beta}$ by equ. 91)

1. )  L„di"/dt  ~  dtpridt  »  ...  induced  voltage  due  to  the 

variation  of  currents. 

2. )  lla,  o]  t*t'  »  ^fi‘  *■  . . .  C'oriolis  voltage  due  to  the 

motion  of  coordinate  axes. 

3. )  T,„i*i‘  «  4>t*i*  *  . . .  generated  voltage  due  to  the 

motion  of  rotor  conductors. 

4. )  Pi.. “  ...  total  generated  voltage.  _  92 
Summarizing  the  various  flux-density  vectors: 

1. )  “  P.t.  “  Tmyii''  «...  rotor  flux  density. 

2. )  ^1.  •  *  [ly,  a]  i'*'  .  Coriolis  flux  density. 

3. )  ^’1,  ■  *  +  ♦i, «  *  P«7.  mi'  *  . . .  sum  of  the  rotor  and 

Coriolis  flux  densities.  ,  ,93 

Summarizing the induced voltages (if $\varphi_\alpha = L_{\alpha\beta}\,i^\beta$ is the flux-linkage
vector)

1. )  L,gdi*;dt  »  dip„/dt  ^  .  induced  voltage 

2. )  (dLmg/dt)i^  — . Coriolis  voltage 

I  3.)  d{Lmgi^),'dt  “  dpm'dt  »■  ....  sum  of  the  induced 
I  and  C’oriolis  volt«^  ^  ^  _  94 


The  Coriolis  voltage  may  be  considered  either  as  an  induced  or  as  a 
generated  voltage. 

Hence the Equation of Voltage can be written in two different forms
in terms of fluxes




The latter equation reduces to the form given by Maxwell, equ. 72,
when the rotor coordinate axes are connected to the moving conductors,
since the last term drops out.

It should be especially noted that $i^\beta\,dL_{\alpha\beta}/dt$ cannot be interpreted
as the voltage due to the motion of the conductors. It is the voltage
due to the motion of the coordinate axes, the Coriolis voltage.

XXVII. The Equation of Torque

The Equation of Torque can be set up by making α equal to t in
equ. 75.

The value of $-\tfrac{1}{2}\,\partial L_{\beta\gamma}/\partial x^t = [\beta\gamma, t]$ is zero in machines with stationary
axes. In machines with moving axes its value must be equal to
$L_{mk}\,C^k_t\,\partial C^m_\beta/\partial x^\gamma$ from the transformation formula of $[\beta\gamma, t]$ (equ. 46)
changing from stationary to moving axes, which however is zero.
Hence in any coordinate system

$$[\beta\gamma, t] = 0 \tag{97}$$

(The Coriolis voltage $[t\alpha,\beta]$ or $[\gamma t,\alpha]$ never reduces to zero in a machine
with moving axes for any definition of $[\gamma\alpha,\beta]$ by equs. 8 or 11 or 80 if
equ. 46 is used, since the denominator on the right hand side is $x^t$ at
least once, unlike for the torque.)

The  value  of  T,.,  is 

T,- -  -  t..f ;  . ;....98 

Hence the Equation of Torque is

>e,  =  R„i'  +  +  Ty„i''f . 99 

at 

That  is 

I\..i  =  Ty„ . . 100 


Since $T_{\gamma\alpha t}\,i^\gamma$ represents the resultant rotor flux-density vector
(equ. 93), the torque of all machines is due to the interaction of the rotor
currents and the rotor flux-density wave, that is



XXVIII.  The  Equation  of  Power 

a.) If the shaft coordinate is excluded, the Equation of Power is
found from the Equation of Voltage, equ. 90, by multiplying through with
the current $i^\beta$

r.i*  “  +  Lmt  ^  »'  +  l<a,  o]  +  T,„iU"P . 102 


The time rate of change of the stored magnetic energy for machines
with moving coordinate axes is (if α and β are stationary, m and n
moving axes)

$$\frac{dT}{dt} = \frac{1}{2}\,\frac{d\left(L_{\alpha\beta}\,C^\alpha_m C^\beta_n\,i^m i^n\right)}{dt}$$


The  last  term  is  the  power  due  to  the  Coriolis  voltage,  the  previous 
term  is  zero  and 


. 

...104 

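$$e_\beta\,i^\beta = R_{\alpha\beta}\,i^\alpha i^\beta + \frac{dT}{dt} + \text{Mechanical Output} \tag{105}$$

where

$$\text{Mechanical Output} = T_{t\alpha\beta}\,i^t i^\alpha i^\beta \tag{106}$$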

Since the output is also equal to torque times velocity, it follows that the
Coriolis flux-density does not contribute to the torque, and the Coriolis
voltage does not contribute to the mechanical output, only to the
variation of the stored magnetic energy.

b.) If the shaft coordinate is not excluded the Equation of Power is
found by multiplying through equ. 75 by $i^\alpha$. Since $-\tfrac{1}{2}\,\partial L_{\beta\gamma}/\partial x^t$ is
always zero

e,P  -  +  L..  P  +  t-i'C . 107 

dt  dx^ 

The other terms cancel. Since from equ. 104 the last two terms are
equal to $dT/dt$, hence


108 



XXIX. The Various Definitions of $\Gamma_{\alpha\beta,\gamma}$

a.) It should be noted that there are several definitions of $\Gamma_{\alpha\beta,\gamma}$, each
definition giving different matrices for $\Gamma_{t\beta,\gamma}$ and $\Gamma_{\alpha\beta,t}$, but still the
same final voltages and torques.

In defining $\Gamma_{\alpha\beta,\gamma}$ from the standard Lagrangian equation (sections V
and VI) there were four different definitions of $\Gamma_{\alpha\beta,\gamma}$ according to the
definition of the Christoffel symbol with two (equ. 11) or with three
terms (equ. 8) and according to the order of the indices in defining
them as $\Gamma_{\alpha\beta,\gamma}$ or $\Gamma_{\beta\alpha,\gamma}$.

In defining $\Gamma_{\alpha\beta,\gamma}$ from the generalized equation of Lagrange (section
XXIV) the same arbitrary definitions can be selected, but even so the
three matrices of $\Gamma_{\alpha\beta,\gamma}$ are different from those above.

1. )  The  rotor  generated  voltage  may  be  defined  as  Pi^,  yi*  (equ.  91)  or 
P«»,  y  (equ.  51)  with  matrix  51  or  as  the  sum  of  two  matrices  57  or  58  when 
defined  as  P(i«), ,  in  section  VId. 

2.) The torque may be defined with matrix 51 as given in equ. 98 by
$T_{\gamma\alpha t}$, or with its symmetrical part, matrix 56, since the torque is a
homogeneous quadratic form and the skew-symmetrical matrix gives zero
torque.

3.) The Coriolis voltage may be defined as $[t\beta,\gamma]$ or $[\alpha t,\gamma]$ or $[(t\beta),\gamma]$
by equ. 8 or 11.
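
The remark in item 2, that only the symmetrical part of a matrix contributes to a homogeneous quadratic form, can be checked numerically. The following is a minimal sketch in Python; the function names, the 3 x 3 matrix `M` and the vector `i_vec` are arbitrary illustrations assumed here and are not matrix 51 or matrix 56 of the paper.

```python
from fractions import Fraction


def transpose(m):
    """Return the transpose of a square matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def symmetric_part(m):
    """Symmetrical part (m + m^T) / 2, kept exact with fractions."""
    t = transpose(m)
    n = len(m)
    return [[Fraction(m[r][c] + t[r][c], 2) for c in range(n)] for r in range(n)]


def skew_part(m):
    """Skew-symmetrical part (m - m^T) / 2, kept exact with fractions."""
    t = transpose(m)
    n = len(m)
    return [[Fraction(m[r][c] - t[r][c], 2) for c in range(n)] for r in range(n)]


def quadratic_form(m, v):
    """Homogeneous quadratic form: sum over r, c of m[r][c] * v[r] * v[c]."""
    n = len(v)
    return sum(m[r][c] * v[r] * v[c] for r in range(n) for c in range(n))


# Arbitrary illustrative data (assumed, not taken from the paper).
M = [[1, 4, -2],
     [0, 3, 5],
     [6, -1, 2]]
i_vec = [2, -1, 3]

full_value = quadratic_form(M, i_vec)
symmetric_value = quadratic_form(symmetric_part(M), i_vec)
skew_value = quadratic_form(skew_part(M), i_vec)

print(full_value, symmetric_value, skew_value)
```

Whatever matrix and vector are chosen, the skew-symmetrical part contributes nothing, so either the full matrix or its symmetrical part may be used for the torque.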

b. )  Also  it  should  be  noted  that  the  transformation  formula  46  could 
have  been  defined  also  as 

+  . io» 

if  equ.  11  had  been  defined  as  P.»,t  instead  of  P«.,i.  The  definition  46 
was  assumed  to  conform  to  the  order  of  the  indices  as  given  by  Schouten. 

c.) For purposes of quick calculations $\Gamma_{\alpha\beta,\gamma}$ for the representative
machine with stationary coordinate axes should be defined as given by equ. 75,
since that equation defines both $\Gamma_{t\beta,\gamma}$ and $\Gamma_{\alpha\beta,t}$ with the same matrix 51 and
saves the calculation of different formulae for the torque and voltage. For
any other machine $\Gamma_{\alpha\beta,\gamma}$ is found by a transformation of coordinates with
the aid of equ. 109.

XXX. Physical Definitions of the Dynamical Equations

a.) For the representative machine with moving coordinate axes
the Equation of Motion of Lagrange




can be divided into the Equation of Voltage


where  the  first  term  of  the  left-hand  member  gives  all  the  voltages  due 
to  the  presence  of  flux  lines.  The  Equation  of  Torque  is 


where the first term represents the time rate of change of angular
momentum and the second term the machine torque.

b.)  The  physical  interpretation  of  the  generalized  Equation  of  Motion 
valid  for  all  electrical  machinery 


.  ST/dCj 


/. . 113 


is  different  from  that  of  the  standard  form.  The  Equation  of  Voltage  is 


-  ( .  dT  dCj 

dt  \axV  •  dx" 


ctc;x> 


-I- 


dx* 


e, . 114 


where the first term of the left-hand member now represents only the
induced and Coriolis voltages, while the second member represents the
rotor generated voltages. The extra term appears however only in
the equations of the axes stationary on the rotor. For all stator axes
and for all axes moving with the rotor the generalized equation reduces
to the standard Lagrangian form.

The  Equation  of  Torque  is 


d 

dt 


dT  dT  act  .  dF 


/. . 115 


The second term on the left-hand side is identically zero (equ. 97) and
the machine torque is given now by the third term. It is interesting
to note that although axis t remains unchanged, still the original equation
is not valid for it, only the generalized one.

Equ.  107  shows  that  the  power  due  to  the  additional  terms  in  the 
generalized  equation  is  zero.  Hence  the  additional  terms  may  be  called 
“gyroscopic  forces.” 

c.) These interpretations are in conformity with the rules given by




Appell for non-holonomic dynamical systems of various orders. These
rules are the following:

1.) For the coordinate axes that are changed the generalized equations
apply.

2.) For the coordinate axes that are not changed but occur in the
transformation tensor the generalized equations also apply.

3.) For the coordinate axes that are not changed and that do not
occur in the transformation tensor the generalized equations reduce to
the original Lagrangian equation.

THE  ABSOLUTE  CALCULUS 

XXXI.  The  Generalized  “Per-unit”  or  Contravariant  Quantities 

a.) In connection with the analysis of synchronous machinery a
simplification is generally used to avoid the use of conversion factors.
It  is  assumed  that  the  stator  current  is  numerically  equal  to  its  flux- 
linkages  in  the  rotor,  that  is  the  rotor  flux-linkage  i^'Mi  is  replaced  in 
analytical  work  by  i*‘  and  by  »**.  As  a  consequence  of  this 

assumption  the  currents  and  voltages  are  replaced  by  other  quantities, 
the  so-called  “per-unit”  quantities.  For  instance  the  per-unit  stator 
terminal  voltage  is  e,iMilr^  instead  of  e^,  etc. 

The disadvantages of this limited assumption are that:

1.) the symmetry of all scalar equations for the stator and rotor is
destroyed.

2.) it cannot be used for other machines.

b.) In the Absolute Calculus there is a fundamental process of simplification
called the “raising (or lowering) of indices” which in the case of
rotating machinery is equivalent to a generalization of the per-unit concept.

The resultant flux-linkages of a winding α due to all currents are
represented by $L_{\alpha\beta}\,i^\beta$. In order to represent the resultant flux-linkages
of each winding by the current itself flowing in that winding, every operator
of the machine is multiplied by the inverse of the metric tensor $L_{\alpha\beta}$, that is
by $L^{\alpha\beta}$. If the Equation of Motion is multiplied through by $L^{\alpha\delta}$, then

$$L^{\alpha\delta}e_\alpha = L^{\alpha\delta}R_{\alpha\beta}\,i^\beta + \frac{di^\delta}{dt} + L^{\alpha\delta}\Gamma_{\beta\gamma,\alpha}\,i^\beta i^\gamma \tag{116}$$

1.) $L^{\alpha\delta}e_\alpha$ is denoted by $e^\delta$ and is called the “contravariant voltage.”
It corresponds to the expression “per-unit voltage.”

2.) $L^{\alpha\delta}R_{\alpha\beta}$ is denoted by $R^\delta_\beta$ and called the “mixed resistance tensor.”

3.) $L^{\alpha\delta}\Gamma_{\beta\gamma,\alpha}$ is denoted by $\Gamma^\delta_{\beta\gamma}$ and called the generalized Christoffel




symbol of the “second” kind. The notation is not used because
$\Gamma^\delta_{\beta\gamma}$ is not a tensor. Its transformation formula is


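The Equation of Motion becomes in terms of contravariant quantities

$$e^\delta = R^\delta_\beta\,i^\beta + \frac{di^\delta}{dt} + \Gamma^\delta_{\beta\gamma}\,i^\beta i^\gamma$$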


It can be seen that $L_{\alpha\beta}\,di^\beta/dt$ is replaced by $di^\alpha/dt$ in all equations and:


1.) the symmetry of all equations is maintained,

2.) the simplification can be used for all rotating machines simply by
multiplying all their operators by the inverse of the metric tensor $L_{\alpha\beta}$.
In calculating the contravariant operators of a machine whose matrix
of transformation is not square, first its covariant operators $L_{\alpha\beta}$,
$\Gamma_{\beta\gamma,\alpha}$ etc. should be calculated and only then its contravariant operators
$\Gamma^\alpha_{\beta\gamma}$, $R^\alpha_\beta$ etc. (See section XXXIV.)

In the accompanying Table IV the calculated values of the contravariant
operators of both representative machines are given.

c.)  An  important  physical  interpretation  can  be  given  to  the  inverse 
of  the  metric  tensor.  Let  denote  the  inductance  of  the  stator  wind¬ 
ing  along  the  direct  axis  while  the  rotor  winding  is  short-circuited. 
Then  A  =  L^rs  —  AfJ  *  LrXa.  If  Ms/Lrd  *  Vrd  then  also  can 
be  written  as 


The  equation  sm  e*  »  i*  represents  the  contravariant  impressed 
voltages  as  currents  in  the  various  axes  due  to  covariant  voltages 


[TABLE IV. The contravariant operators of the representative machine with stationary coordinate axes.]




impressed while all axes are short-circuited. Also, while the metric
tensor is the measure of the permeances of the various magnetic circuits
with the windings open-circuited, the inverse metric tensor is the measure
of the reluctances of the magnetic circuits with the windings short-circuited.
In the first case the magnetic lines follow paths with the maximum
possible permeances, in the second case they follow the paths with the
minimum possible permeances.

d.) Each term of the inverse of a dyadic such as $L^{\alpha\beta}$ is found as follows:

1.) Replace each term by the determinant formed when the corresponding
row and column are removed.

2.) Multiply it by plus or minus one according as the term occupies
an even or odd element counted from the upper left hand corner.

3.) Interchange the two indices.

4.) Divide it by the determinant of the dyadic.
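
The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the 3 x 3 sample matrix are assumptions made here and are not values from the paper. The sign rule of step 2 is taken as (-1) raised to the sum of the row and column positions, and exact fractions are used so that the product with the original matrix can be compared exactly with the unit matrix.

```python
from fractions import Fraction


def minor(m, row, col):
    """Matrix left when the given row and column are removed (step 1)."""
    return [r[:col] + r[col + 1:] for k, r in enumerate(m) if k != row]


def det(m):
    """Determinant by expansion along the first row."""
    if not m:
        return 1  # determinant of the empty matrix, needed for 1 x 1 input
    return sum((-1) ** c * m[0][c] * det(minor(m, 0, c)) for c in range(len(m)))


def inverse_by_cofactors(m):
    """Inverse of a square matrix following the four steps of the text."""
    n = len(m)
    d = det(m)
    if d == 0:
        raise ValueError("matrix is singular, no inverse exists")
    # Steps 1 and 2: signed determinants of the minors (the cofactors).
    cof = [[(-1) ** (r + c) * det(minor(m, r, c)) for c in range(n)]
           for r in range(n)]
    # Step 3 (interchange the indices) and step 4 (divide by the determinant).
    return [[Fraction(cof[c][r], d) for c in range(n)] for r in range(n)]


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


# Hypothetical symmetric sample matrix standing in for a metric tensor.
L_sample = [[2, 1, 0],
            [1, 3, 1],
            [0, 1, 2]]
L_inverse = inverse_by_cofactors(L_sample)
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(matmul(L_sample, L_inverse) == identity)
```

Expansion by minors is convenient for the small matrices of a single machine; for larger matrices an elimination method would normally be preferred.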

XXXII. Absolute or Covariant Differentiation

a.) It has been shown in section XIII that the derivative or the
differential of a tensor is not a tensor. In textbooks on the Absolute
Calculus a new type of differentiation is introduced, called the “absolute”
or “covariant” or “intrinsic” differentiation, denoted by δ, which:

1.) always produces a tensor from a tensor as a result of the differentiation,
that is it determines another invariant from a known invariant
by a routine process

2.) also obeys all the rules of ordinary differentiation, among others
the following rule $\delta(AB) = (\delta A)B + A(\delta B)$

3.) as a special case reduces to ordinary differentiation.

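The absolute differentials of tensors of rank one and two are:

$$\delta A^\alpha = dA^\alpha + \Gamma^\alpha_{\beta\gamma}A^\beta dx^\gamma \tag{121}$$

$$\delta A_\alpha = dA_\alpha - \Gamma^\gamma_{\alpha\beta}A_\gamma dx^\beta \tag{122}$$

$$\delta A^{\alpha\beta} = dA^{\alpha\beta} + \Gamma^\alpha_{\gamma\delta}A^{\gamma\beta}dx^\delta + \Gamma^\beta_{\gamma\delta}A^{\alpha\gamma}dx^\delta \tag{123}$$

$$\delta A_{\alpha\beta} = dA_{\alpha\beta} - \Gamma^\gamma_{\alpha\delta}A_{\gamma\beta}dx^\delta - \Gamma^\gamma_{\beta\delta}A_{\alpha\gamma}dx^\delta \tag{124}$$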


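b.) The absolute derivative of $i^\alpha$ with respect to time is

$$\frac{\delta i^\alpha}{\delta t} = \frac{di^\alpha}{dt} + \Gamma^\alpha_{\beta\gamma}\,i^\beta i^\gamma \tag{125}$$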




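If this equation is substituted into the two forms of the Equation of
Motion the contravariant form becomes

$$e^\alpha = R^\alpha_\beta\,i^\beta + \frac{\delta i^\alpha}{\delta t} \tag{126}$$

The covariant form becomes

$$e_\alpha = R_{\alpha\beta}\,i^\beta + L_{\alpha\beta}\frac{\delta i^\beta}{\delta t} \tag{127}$$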

Each term in these equations is a tensor and they express the fact
that the theory of rotating machinery during acceleration is identical with
that of stationary networks with resistances and inductances provided
ordinary differentiation is replaced by absolute differentiation.

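The tensor form of the Equation of Voltage of Maxwell is

$$e_\alpha = R_{\alpha\beta}\,i^\beta + \frac{\delta\left(L_{\alpha\beta}\,i^\beta\right)}{\delta t} \tag{128}$$

$L_{\alpha\beta}\,\delta i^\beta/\delta t$ also represents all the voltages due to the presence of
flux-lines, that is the sum of the induced, generated and Coriolis voltages.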

c.) The absolute derivative of the metric tensor is zero. That is, by
equs. 124, 82 and 8


iLmS 

'bx* 


bLmS 


-  [abff]  -  [db.a]  -  -  Ttfu  -  0 .  . 


bLmS 

dx* 

,...129 


d.) Since the sum of $L_{\alpha\beta}\,di^\beta/dt$ and $\Gamma_{\beta\gamma,\alpha}\,i^\beta i^\gamma$ is a tensor, in calculating
the performance of any particular rotating machine it is not necessary
to use their transformation formulae in equs. 44 and 46. As long as it
is not necessary to calculate separately the induced and generated voltages,
it is sufficient to treat each expression in the Equation of Motion as a
tensor and use the simple formula of tensor transformations (equ. 36).
This represents large savings of routine labor.

If the rotating machine has no moving coordinate axes, then all expressions
are transformed as tensors.

and  the  expression  for  torque  T^s.ti^i*  are  transformed  in  all  cases 
as  tensors. 




THE THEORY OF NON-EUCLIDEAN SPACES

XXXIII. Types of Spaces

a.) The theory of a set of linear differential equations is usually
analysed in the language of multidimensional geometry. The variables
are assumed to form a curvilinear coordinate system in an
n-dimensional space or manifold and the equations represent lines,
two-, three- etc. dimensional surfaces curved in an n-dimensional space.

By denying the fifth postulate of Euclid, that only one line can be
drawn through a point parallel to a given line, two consistent geometries
have been set up during the last century, after two thousand years of
unsuccessful efforts to prove it. By assuming that more than one
parallel line is possible Lobachevsky, Bolyai, etc. built up the
“hyperbolic geometry” and by assuming that no parallel lines are
possible Gauss, Riemann, Klein, etc. built up the “elliptic geometry.”
Both geometries give as special cases the Euclidean or “parabolic
geometry.” It seems that physical phenomena can be described only
in terms of motions in a space whose geometry is elliptic. It is the
geometry of the various types of elliptic spaces that will be considered
below.

b.) The various types of spaces differ from each other in the definition
of two all-important concepts. These concepts are the “metric” and the
“connection.”

1.) To compare two vectors located in one point along different directions
an infinitesimal length $ds$ (called the “line element”) has to be
defined. If $ds^2$ is defined in every coordinate system by the homogeneous
quadratic form

$$ds^2 = g_{\alpha\beta}\,dx^\alpha dx^\beta \tag{130}$$

where $g_{\alpha\beta}$ is a symmetric tensor of rank two (see equ. 3), the space is
called a “metric space,” $g_{\alpha\beta}$ the “metric tensor” and $ds^2$ the “metric.”
(The formula is a generalization of the Pythagorean Theorem for n
infinitesimal oblique axes.) If $ds^2$ is defined by a different function of
$dx^\alpha$ the space is sometimes called “Finsler space.”

2.) To compare two vectors located at different points of the space, first
it is necessary to shift one “parallel to itself” to the other. To visualize
the problem let two equal vectors lie at different points on the
Earth’s surface. If both lie in the north-south direction, viewed from the
Earth’s surface the difference between them, $\delta A^\alpha$, may arbitrarily be
assumed to be zero (after shifting one parallel to itself along a great



circle) or any other quantity. But viewed from the outside space the
difference between them is something else, $dA^\alpha$, a definite quantity.
The relation between the two differences is

$$\delta A^\alpha = dA^\alpha + \Gamma^\alpha_{\beta\gamma}A^\beta dx^\gamma \tag{131}$$

where $\Gamma^\alpha_{\beta\gamma}$ is any arbitrary number system of rank three (see equs. 15
and 16). Each coefficient is a function of the variables $x^\alpha$ and is
called the “coefficient of connection.”
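
The remark under item 1, that the quadratic form of the line element generalizes the Pythagorean Theorem to oblique axes, can be illustrated with a small numerical sketch. Everything in it is an assumption chosen for illustration: two unit axes inclined at 60 degrees, a displacement of one unit along each, and the helper names `oblique_metric` and `line_element_squared`. The result is compared with the squared length computed in ordinary rectangular coordinates.

```python
import math


def oblique_metric(angle):
    """Metric tensor for two unit axes inclined at the given angle (radians)."""
    c = math.cos(angle)
    return [[1.0, c],
            [c, 1.0]]


def line_element_squared(g, dx):
    """ds^2 = sum over a, b of g[a][b] * dx[a] * dx[b]."""
    n = len(dx)
    return sum(g[a][b] * dx[a] * dx[b] for a in range(n) for b in range(n))


# Illustrative values (assumed): axes at 60 degrees, unit step along each.
angle = math.radians(60.0)
g = oblique_metric(angle)
dx = [1.0, 1.0]
ds2 = line_element_squared(g, dx)

# The same displacement resolved along rectangular axes for comparison.
x = dx[0] + dx[1] * math.cos(angle)
y = dx[1] * math.sin(angle)
cartesian = x * x + y * y

print(ds2, cartesian)

# With orthogonal axes the metric is the unit matrix and the ordinary
# Pythagorean Theorem is recovered.
right_angle = line_element_squared(oblique_metric(math.pi / 2.0), [3.0, 4.0])
print(right_angle)
```

The off-diagonal coefficients of the metric carry the cosine of the angle between the axes, which is why they vanish when the axes are orthogonal.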

c.) If $\Gamma^\alpha_{\beta\gamma}$ is defined in terms of the metric tensor as

$$\Gamma^\alpha_{\beta\gamma} = \frac{1}{2}\,g^{\alpha\delta}\left(\frac{\partial g_{\delta\beta}}{\partial x^\gamma} + \frac{\partial g_{\delta\gamma}}{\partial x^\beta} - \frac{\partial g_{\beta\gamma}}{\partial x^\delta}\right) \tag{132}$$

the space is called “Riemannian” (see equ. 8), otherwise “non-Riemannian”
or “affine” (see equ. 80). In a non-Riemannian space a metric may or
may not be defined; also $\Gamma^\alpha_{\beta\gamma}$ may be symmetrical in β and γ (so that
$\Gamma^\alpha_{\beta\gamma} = \Gamma^\alpha_{\gamma\beta}$) or asymmetrical. A special case of the asymmetric affine
space is the “space with torsion” (equ. 129). All rotating electrical
machinery is equivalent to such spaces. A special case of the symmetrical
affine space is the “Weyl space” and a special case of the
Riemannian space is the “flat” space for which the curvature tensor
$K_{\alpha\beta\gamma\delta}$, defined later in equ. 157, is zero. If the line element is definite,
that is if it is always different from zero, the flat space is called
“Euclidean.” If in a Euclidean space the axes are oblique instead of
curvilinear each coefficient of the Christoffel symbol is zero (equ. 136).
If the axes are orthogonal the metric tensor reduces to a unit matrix.

d.) Among the spaces the only ones that can be visualised are:

1.) The one-, two- and three-dimensional Euclidean space.

2.) The two-dimensional Riemannian space (the various types of
surfaces).

XXXIV. Holonomic Sub-spaces

a.) It is important to note that the properties of a space are independent
of the particular coordinate axes from which measurements are made.
In other words two spaces are equivalent if it is possible to pass from one
space to the other and back by a transformation tensor $C^\kappa_\alpha = \partial x^\kappa/\partial x^\alpha$ and
its inverse. If however $C^\kappa_\alpha$ cannot be expressed as $\partial x^\kappa/\partial x^\alpha$ then no one-to-one
relation exists between the points of the two spaces (since no equation can
be set up between the two sets of variables $x^\kappa$ and $x^\alpha$) and the two spaces
are not equivalent. Equivalent spaces are for instance the shunt polyphase
commutator motor, fig. 4, and the induction motor, fig. 2.




b.) If the matrix of the transformation tensor $C^\kappa_\alpha = \partial x^\kappa/\partial x^\alpha$ is not
square as above, but rectangular, then the space which has fewer coordinate
axes is said to be a “sub-space” of the other. For instance the repulsion
motor, fig. 6, is a sub-space of the induction motor, fig. 2, or of the
Déri motor, fig. 7.

Among the possible sub-spaces the only ones visualizable are:

1.) The plane as a two-dimensional Euclidean sub-space in a
three-dimensional Euclidean space.

2.) A surface like the sphere, as a two-dimensional Riemannian
sub-space in a three-dimensional Euclidean space.

c.) If in the transformation tensor $C^{\delta}_{\alpha}$ the number of the old coordinate
axes is n and the number of the new coordinate axes is m then
all the polyadics of the old space that have only covariant (lower) indices
can be assigned to the sub-space by transforming them with the aid of the
formulae developed in the previous pages. For instance the metric tensor
$L_{\alpha\beta}$ of the sub-space can be and should be defined as

$L_{\alpha\beta} = L_{\delta\epsilon}\, C^{\delta}_{\alpha}\, C^{\epsilon}_{\beta}$ . . . . . 133

where the dummy indices $\delta$ and $\epsilon$ assume all values from one to n
while the free indices $\alpha$ and $\beta$ assume the values from one to m. The
same variation of indices applies in the transformation of $\Gamma_{\alpha\beta,\gamma}$. All
these polyadics of the sub-space are called “induced” or “derived”
polyadics.
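
As a minimal numerical sketch of equ. 133, the induced metric can be formed in matrix terms as the product of the transposed C, the old metric and C. The 3-axis metric and the rectangular C below are hypothetical values chosen only for illustration; they do not describe any machine of the paper.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def induced_metric(L, C):
    """Induced metric of the sub-space: C^T L C (matrix form of equ. 133)."""
    return matmul(matmul(transpose(C), L), C)


# Hypothetical symmetric metric of an enveloping space with n = 3 axes.
L = [[2.0, 0.5, 0.0],
     [0.5, 3.0, 1.0],
     [0.0, 1.0, 4.0]]

# Hypothetical rectangular C (n = 3 rows, m = 2 columns):
# old axes 1 and 3 are both tied to new axis 1, old axis 2 is new axis 2.
C = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 0.0]]

L_sub = induced_metric(L, C)
print(L_sub)
```

The result is an m-by-m symmetric matrix, as expected of a metric of the sub-space.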

Since the performance of many rotating machines is expressed in
terms of induced polyadics, their performance may be considered as the
motion of a particle describing a trajectory in an n-dimensional
non-Riemannian space but constrained to move upon an m-dimensional
non-Riemannian surface or sub-space.

d.) If it is attempted to transform a polyadic which has a contravariant
or upper index, say $\Gamma^{\alpha}_{\beta\gamma}$, the value of the inverse of $C^{\delta}_{\alpha}$ is required.
But the matrix of $C^{\delta}_{\alpha}$ is rectangular and its inverse can not be
calculated without a definition.

In case of non-Riemannian spaces for which no metric is defined it
is customary to form a square matrix from the rectangular matrix by
assuming additional coordinate axes outside the sub-space in the
enveloping space with n-m dimensions. For instance an additional set
of brushes may be put on the rotor of the repulsion motor, fig. 6 (and a
winding on the stator quadrature axis) but they would remain permanently
open-circuited. Then the new transformation tensor of the
repulsion motor would be similar or identical with that of the shunt
polyphase commutator motor, fig. 4; it would have an inverse and all
“induced” contravariant polyadics could be calculated. Their value
however depends on the position of the additional axes.

e.) In electrical machinery the representative machine has a metric
tensor $L_{\alpha\beta}$ and so the sub-space has an induced metric. It seems
obvious to define derived contravariant polyadics for the sub-space simply
by raising indices with the aid of $L^{\alpha\beta}$. That is, $\Gamma^{\alpha}_{\beta\gamma}$ for the sub-space is
defined as $\Gamma_{\beta\gamma,\kappa}\, L^{\kappa\alpha}$.

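The above definition of $\Gamma^{\alpha}_{\beta\gamma}$ may be taken also as the definition of the
inverse connection tensor $C^{\alpha}_{\delta}$ corresponding to the formula

$\Gamma^{\alpha}_{\beta\gamma} = \Gamma^{\delta}_{\epsilon\zeta}\, C^{\epsilon}_{\beta} C^{\zeta}_{\gamma} C^{\alpha}_{\delta} = (\Gamma_{\epsilon\zeta,\eta}\, C^{\epsilon}_{\beta} C^{\zeta}_{\gamma} C^{\eta}_{\kappa})\, L^{\kappa\alpha} = (\Gamma^{\delta}_{\epsilon\zeta}\, L_{\delta\eta}\, C^{\epsilon}_{\beta} C^{\zeta}_{\gamma} C^{\eta}_{\kappa})\, L^{\kappa\alpha}$

which shows that the above definition of $\Gamma^{\alpha}_{\beta\gamma}$ is equivalent to the following
definition of the inverse connection tensor

$C^{\alpha}_{\delta} = C^{\eta}_{\kappa}\, L_{\delta\eta}\, L^{\kappa\alpha}$ . . . . . 134

$C^{\alpha}_{\delta}$ can be taken as the inverse of $C^{\delta}_{\alpha}$ since

$C^{\alpha}_{\delta}\, C^{\delta}_{\beta} = C^{\eta}_{\kappa}\, C^{\delta}_{\beta}\, L_{\delta\eta}\, L^{\kappa\alpha} = \delta^{\alpha}_{\beta}$ . . . . . 135

where $\delta^{\alpha}_{\beta}$ is a unit matrix with m unit coefficients in the diagonal.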
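
The relation of equs. 134 and 135 can be checked numerically. The sketch below assumes the matrix reading of equ. 134, namely that the inverse of the rectangular C is the inverse of the induced metric times the transposed C times the old metric; the metric and C are hypothetical integers, and exact fractions are used so that the unit matrix comes out exactly.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def inverse(a):
    """Exact Gauss-Jordan inverse of a non-singular square matrix."""
    n = len(a)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        # Raises StopIteration if the matrix is singular.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [x / pv for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def metric_inverse_connection(L, C):
    """Inverse of rectangular C defined through the metric (matrix form of equ. 134)."""
    Ct = transpose(C)
    L_sub = matmul(matmul(Ct, L), C)          # induced metric, equ. 133
    return matmul(matmul(inverse(L_sub), Ct), L)


# Hypothetical metric (n = 3) and rectangular C (3 rows, 2 columns).
L = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 4]]
C = [[1, 0],
     [0, 1],
     [1, 0]]

C_inv = metric_inverse_connection(L, C)
unit = matmul(C_inv, C)                        # equ. 135: m-by-m unit matrix
print(unit)
```

Only the product in this order gives a unit matrix; the product of C with its metric inverse in the other order is an n-by-n projection, not a unit matrix.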

XXXV. Non-holonomic Sub-spaces

In many cases in which the number of new coordinates is less than
the number of old coordinates the transformation tensor $C^{\delta}_{\alpha}$ can not
be written as $\partial x^{\delta}/\partial x^{\alpha}$. Such a case occurs for instance in the
single-phase alternator, fig. 3, where axis b should be removed. (In general
moving coordinate axes form non-integrable transformation tensors.)
Lately the expression “non-holonomic sub-space” has been introduced
to denote the restricted positions of the particle.

For both types of sub-spaces with each point of the n-dimensional
space there is associated an m-dimensional infinitesimal surface element
and the particle must leave that point along the surface element. But
while in a true sub-space the surface elements may be combined to form a
family of surfaces so that the particle can never leave the particular surface
it happens to be on, in a non-holonomic sub-space the surface elements
can not be combined into a surface and the particle may pass from any one
point of the n-dimensional space to any of its other points.
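
The distinction can be illustrated numerically. A row of the transformation tensor can be written as a gradient only if its cross-derivatives agree; the sketch below tests that condition by central differences at one point. The two example rows (a stationary axis, and an axis rotating with the angle taken as the third coordinate) are hypothetical illustrations, not formulas taken from the paper.

```python
import math


def cross_derivative_gaps(form, point, h=1e-5):
    """Return d(w_j)/dx_k - d(w_k)/dx_j for all j < k, by central differences."""
    n = len(point)

    def partial(j, k):
        up, dn = list(point), list(point)
        up[k] += h
        dn[k] -= h
        return (form(up)[j] - form(dn)[j]) / (2 * h)

    return {(j, k): partial(j, k) - partial(k, j)
            for j in range(n) for k in range(j + 1, n)}


def is_holonomic(form, point, tol=1e-6):
    """True if the row passes the local integrability test at the point."""
    return all(abs(v) < tol for v in cross_derivative_gaps(form, point).values())


# Coordinates are (x_a, x_b, theta); each function returns one row of C.
def stationary_axis(x):
    # dx' = dx_a, a stationary axis (hypothetical example).
    return [1.0, 0.0, 0.0]


def rotating_axis(x):
    # dx' = cos(theta) dx_a + sin(theta) dx_b, an axis moving with theta.
    return [math.cos(x[2]), math.sin(x[2]), 0.0]


point = [0.3, 0.2, 0.7]
print(is_holonomic(stationary_axis, point))
print(is_holonomic(rotating_axis, point))
```

The test is local and covers a single row only; it is meant as an illustration of the integrability condition, not as a general classification procedure.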

To visualise the problem, in a repulsion motor, fig. 6, which is a true
two-dimensional sub-space in a three- (or four-) dimensional
non-Riemannian space, no current (or rather no charges) can flow along
axis b, at right angles to axis a. In a single-phase alternator where the
axis a is rotating no axis can be assigned in which no current (no charges)
could flow. Hence the single-phase alternator is a two- (or three-)
dimensional non-holonomic sub-space of a four-dimensional
non-Riemannian space, that is of the representative machine with stationary
coordinates. (It happens that it may also be considered as a holonomic
sub-space of a four-dimensional Riemannian space, the polyphase
alternator, or also as an independent three-dimensional Riemannian
space.)

It should be remembered however that in case of electrical machinery
the enveloping n-dimensional space need not be considered even in case
of non-holonomic sub-spaces. Every m-dimensional sub-space, holonomic
or non-holonomic, is in its own right an m-dimensional non-Riemannian
space since the new coordinates, even though they are non-holonomic,
can be calculated immediately from the dynamical equations.

XXXVI. Spaces with Infinite Dimensions: Hilbert Spaces

a.) If the number of the new coordinate axes m is larger than the
number of the old coordinate axes n, for instance in case of the frequency
converter, fig. 12, then the same equations of transformation
may be used as previously with the tacit understanding that certain
quantities are neglected.

To visualize the problem, if three sets of axes 120 degrees apart
exist on the armature of a three-phase salient-pole synchronous machine
instead of two at right angles then zero phase-sequence waves exist
inside the machine (which may be considered as third-harmonic space
waves). However the metric tensor of the representative machine
given in equ. 3 does not contain zero phase-sequence reactances, hence
the final equations give only a first approximation to the actual
phenomena. A second approximation could have been made if the
representative machine itself had had at least three coordinate axes
on each rotor layer, since then zero phase-sequence reactances could
have been incorporated in the metric tensor.

That  is  a  more  general  representative  machine  has  more  than  two  axes 
on  each  winding  and  the  representative  machine  of  section  IV  is  a 
sub-space  of  this  more  general  representative  machine. 

b.) In the representative machines of this paper there are two axes
of magnetic symmetry assumed on the stator. In actual machines
each stator and rotor tooth and slot is also an axis of magnetic symmetry,
hence to each of them a coordinate axis may be assigned.




Since the windings are distributed in slots the magnetomotive force
wave introduces additional harmonic waves in unbalanced phase windings
etc., so that the most general representative machine must have an
infinite number of coordinate axes representing a space with infinite
number of dimensions, some axes being stationary, some moving with
the rotor and some rotating with odd speeds. Such cases have practical
importance especially in squirrel-cage induction motors of smaller sizes
where the harmonics are known to distort the ideal performance beyond
recognition in many cases. It is intended to undertake the systematic
dynamical treatment of such generalised systems with infinite coordinate
axes in some future paper.

XXXVII.  Euclidean  Spaces 

a.) If the speed of the rotor is assumed to be constant the Equation
of Voltage (equ. 17) can also be written as

$e_{\alpha} = Z_{\alpha\beta}\, i^{\beta} = (R_{\alpha\beta} + \Gamma_{\beta t,\alpha}\, i^{t})\, i^{\beta} + L_{\alpha\beta}\, p\, i^{\beta}$ . . . . . 136

where the terms in parenthesis are equivalent to resistances. Since
the coefficients of connection are zero, the sudden short-circuit
performance of rotating machines with the speed maintained constant can be
represented by the motion of a particle in an n-dimensional Euclidean
space with oblique cartesian coordinate axes just as if the rotating machines
were stationary networks.

b.) If the rotor is assumed to be stationary then in the above equation
$\Gamma_{\beta t,\alpha}\, i^{t}$ is zero and the transient impedance tensor $Z_{\alpha\beta}$ becomes
symmetrical. Such stationary electrical apparatus are the transformer,
induction regulator etc.

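The transient impedance tensor of the single-phase transformer is

$$
\begin{array}{c|cc}
 & p & s \\ \hline
p & r_p + L_p\,p & M\,p \\
s & M\,p & r_s + L_s\,p
\end{array}
$$

The transient impedance tensor of the three-circuit transformer is

$$
\begin{array}{c|ccc}
 & a & b & c \\ \hline
a & r_a + L_a\,p & M_{ab}\,p & M_{ac}\,p \\
b & M_{ba}\,p & r_b + L_b\,p & M_{bc}\,p \\
c & M_{ca}\,p & M_{cb}\,p & r_c + L_c\,p
\end{array}
$$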


Similar matrix structures can be set up for other stationary electrical
apparatus also, such as vacuum tubes etc.
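
A small sketch of the two-winding matrix above, with the operator p supplied as a number. The parameter names and numerical values are hypothetical, and substituting an imaginary value for p to obtain a steady-state impedance is an assumption of this sketch rather than a step taken in the text.

```python
def transformer_impedance(r_p, L_p, r_s, L_s, M, p):
    """Transient impedance matrix of a two-winding transformer for a given p."""
    return [[r_p + L_p * p, M * p],
            [M * p, r_s + L_s * p]]


def is_symmetric(z):
    """True if the square matrix equals its transpose."""
    n = len(z)
    return all(z[i][j] == z[j][i] for i in range(n) for j in range(n))


# Hypothetical winding data: resistances in ohms, inductances in henries.
omega = 377.0                      # assumed angular frequency, rad/s
Z = transformer_impedance(0.5, 0.10, 0.2, 0.05, 0.06, 1j * omega)
print(is_symmetric(Z))

# With p = 0 only the resistances remain on the diagonal.
Z_dc = transformer_impedance(1, 2, 3, 4, 5, 0)
print(Z_dc)
```

The symmetry of the matrix reflects the single mutual inductance M shared by both off-diagonal positions.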

When  resistances,  inductances,  transformers,  vacuum  tubes  etc.  are 
connected  with  rotating  machines  in  any  manner  whatever  each  unit 
is  considered  as  a  rotating  machine  or  better  said  each  rotating  machine 
is  considered  as  a  simple  impedance. 

c.) With steady a-c voltages impressed at the terminals the variable
$i^{\alpha}$ assumes complex values in each axis. The study of such spaces is
outside the scope of this paper.


HUNTING

XXXVIII. The Equation of Small Oscillations

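Let an infinitesimal disturbance $de^{\alpha}$ (infinitesimal terminal voltage or
shaft torque) be applied to any machine while it is either accelerated or
is in a uniform motion. Then the Equation of Motion (equ. 126)

$e^{\alpha} = R^{\alpha}_{\beta}\, i^{\beta} + \frac{di^{\alpha}}{dt} + \Gamma^{\alpha}_{\beta\gamma}\, i^{\beta} i^{\gamma}$

changes to

$e^{\alpha} + de^{\alpha} + \frac{\partial e^{\alpha}}{\partial x^{\delta}}\, dx^{\delta} = (R^{\alpha}_{\beta} + dR^{\alpha}_{\beta})(i^{\beta} + di^{\beta}) + \frac{d(i^{\alpha} + di^{\alpha})}{dt} + (\Gamma^{\alpha}_{\beta\gamma} + d\Gamma^{\alpha}_{\beta\gamma})(i^{\beta} + di^{\beta})(i^{\gamma} + di^{\gamma})$.

If the Equation of Motion is subtracted from it and all products of
infinitesimals are cancelled, the Equation of Small Oscillations in
contravariant form is

$de^{\alpha} + \frac{\partial e^{\alpha}}{\partial x^{\delta}}\, dx^{\delta} = R^{\alpha}_{\beta}\, di^{\beta} + dR^{\alpha}_{\beta}\, i^{\beta} + \frac{d(di^{\alpha})}{dt} + \Gamma^{\alpha}_{\beta\gamma}\, di^{\beta} i^{\gamma} + \Gamma^{\alpha}_{\beta\gamma}\, i^{\beta} di^{\gamma} + d\Gamma^{\alpha}_{\beta\gamma}\, i^{\beta} i^{\gamma}$ . . . . . 137

In covariant form it is

$de_{\alpha} + \frac{\partial e_{\alpha}}{\partial x^{\delta}}\, dx^{\delta} = R_{\alpha\beta}\, di^{\beta} + L_{\alpha\beta}\, \frac{d(di^{\beta})}{dt} + dL_{\alpha\beta}\, \frac{di^{\beta}}{dt} + \Gamma_{\beta\gamma,\alpha}\, di^{\beta} i^{\gamma} + \Gamma_{\beta\gamma,\alpha}\, i^{\beta} di^{\gamma} + d\Gamma_{\beta\gamma,\alpha}\, i^{\beta} i^{\gamma}$ . . . . . 138
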

Equ. 137 is equivalent to




or  to 

+  . 140 

XXXIX. Physical Interpretation

a.) The Equation of Small Oscillations also can be divided into two
component equations. The Equation of Small Voltages is found by allowing the
free index $\alpha$ to assume any value but t. (The definitions of section VI
a, b, c are followed.)

•le.  +  ^  ~  +  r«. .  di>  1' 


+  +  . 141 

In  terms  of  physical  space  vectors  the  equation  becomes 

dc.  +  ^  di^  -  R„  dif  +  d  +  ^1. .  di' . 142 


The Equation of Small Torques is found by allowing $\alpha$ to assume the
value t only

de,  -  K„  di>  +  In  +  r,,. ,  di»  »>  +  r,,. ,  i>  diy  +  dr^y. , . .  1 43 
In  terms  of  space  vectors  the  equation  is 

de,  »  R„di'  -1-  L,,  +  d\f>y, ,  i'*  +  \/fy, ,  di^ . 144 

(It 

b.) At the instant of disturbance two sets of vectors can be differentiated
inside the machine:

1.) the original vectors that exist before the disturbance, satisfying
the Equation of Motion (equs. 34 and 35), which are given in section XI

2.) a set of infinitesimal vectors representing changes in each of the
above vectors.

c.) Some of the vector changes can be divided into two components.
Among the voltages:

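A.) the change in the generated voltage vector can be divided into

1.) $\varphi_{t,\alpha}\, di^{t} = \Gamma_{\beta t,\alpha}\, i^{\beta}\, di^{t}$, due to the change of speed and the original
flux density

2.) $d\varphi_{t,\alpha}\, i^{t} = \Gamma_{\beta t,\alpha}\, di^{\beta}\, i^{t} + d\Gamma_{\beta t,\alpha}\, i^{\beta}\, i^{t}$, due to change of $\varphi_{t,\alpha}$ and to
original $i^{t}$

B.) the change in the torque vector can be divided into:

1.) $\varphi_{\gamma,t}\, di^{\gamma} = \Gamma_{\beta\gamma,t}\, i^{\beta}\, di^{\gamma}$, due to change of current and to original $\varphi_{\gamma,t}$

2.) $d\varphi_{\gamma,t}\, i^{\gamma} = \Gamma_{\beta\gamma,t}\, di^{\beta}\, i^{\gamma} + d\Gamma_{\beta\gamma,t}\, i^{\beta}\, i^{\gamma}$, due to change of $\varphi_{\gamma,t}$ and to
original current.

Among the fluxes:

A.) The change in the flux-linkage vector $d\varphi_{\alpha} = d(L_{\alpha\beta}\, i^{\beta})$ can be
divided into $i^{\beta}\, dL_{\alpha\beta}$ and $L_{\alpha\beta}\, di^{\beta}$

B.) the change in the rotor flux-density vector $d\varphi_{\gamma,t} = d(\Gamma_{\beta\gamma,t}\, i^{\beta})$ can
be divided into $d\Gamma_{\beta\gamma,t}\, i^{\beta}$ and $\Gamma_{\beta\gamma,t}\, di^{\beta}$

C.) the change in the Coriolis flux-density vector is $d(\varphi_{t,\gamma} - \varphi_{\gamma,t})$.
In the equations occurs only $d\varphi_{t,\alpha} = d(\Gamma_{\beta t,\alpha}\, i^{\beta}) = d\Gamma_{\beta t,\alpha}\, i^{\beta} + \Gamma_{\beta t,\alpha}\, di^{\beta}$

To each of the six flux changes corresponds a change of voltage or
torque. The three changes due to $dL_{\alpha\beta}$, $d\Gamma_{\beta\gamma,t}$ and $d\Gamma_{\beta t,\alpha}$ can be
measured only by the moving observers. Hence for a machine with
stationary coordinate axes the Equation of Small Oscillations reduces to

$de_{\alpha} + \frac{\partial e_{\alpha}}{\partial x^{\delta}}\, dx^{\delta} = R_{\alpha\beta}\, di^{\beta} + L_{\alpha\beta}\, \frac{d(di^{\beta})}{dt} + \Gamma_{\beta\gamma,\alpha}\, di^{\beta} i^{\gamma} + \Gamma_{\beta\gamma,\alpha}\, i^{\beta} di^{\gamma}$ . . . . . 145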

A  more  detailed  physical  discussion  is  given  in  the  previous  papers. 


XL.  The  Motional  Impedance 

a.) Let any machine with stationary coordinate axes and without
sliprings be considered. If at the instant of disturbance the machine
is in equilibrium, that is if all its currents are known, the Equation of
Small Oscillations, equ. 145, can be written as

$de_{\alpha} = (R_{\alpha\beta} + L_{\alpha\beta}\, p + \Gamma_{\beta\gamma,\alpha}\, i^{\gamma} + \Gamma_{\gamma\beta,\alpha}\, i^{\gamma})\, di^{\beta}$.

The expression in parenthesis is a tensor of rank two called the
“motional impedance tensor,” representing the opposition of a machine
to a suddenly applied infinitesimal terminal voltage or shaft torque,
allowing its speed also to vary, that is

$Z_{\alpha\beta} = R_{\alpha\beta} + L_{\alpha\beta}\, p + \Gamma_{\beta\gamma,\alpha}\, i^{\gamma} + \Gamma_{\gamma\beta,\alpha}\, i^{\gamma}$ . . . . . 146

Comparing it with the “transient impedance tensor” of equ. 64, it
is found that the transient impedance tensor is only a special case of the
motional impedance tensor, that is


Zaa  ~  Zaa  +  Tay.  +  F„.  .»>  +  Tya.  ,iy . 147 

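The motional impedance tensor differs from the transient impedance tensor
only by an extra row and column corresponding to the additional axis t.

The Equation of Small Oscillations becomes

$de_{\alpha} = Z_{\alpha\beta}\, di^{\beta}$ . . . . . 148

from which $di^{\beta}$ is found as $de_{\alpha}\,(Z_{\alpha\beta})^{-1}$ by calculating the inverse of $Z_{\alpha\beta}$.
For the representative machine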




For any machine or groups of machines with stationary coordinate
axes $Z_{\alpha\beta}$ is found by equ. 68.

b.)  If  the  machine  has  sliprings  or  revolving  brushes,  but  the  coordi¬ 
nate  axes  are  still  assumed  to  be  stationary,  equ.  145  becomes 

de,  -  -^dx‘  •  -^di» . 150 

and the dyadic in parenthesis (not a tensor now) is the motional
impedance. In actual calculations the form of the matrix may be
simplified by assuming all currents $di^{\alpha}$ ($\alpha = d_s, q_s, q_r, d_r$) and $dx^{t}$ as
unknowns. For a salient-pole synchronous machine this simplified form
of motional impedance is (where $e_{dr} = e \sin(\theta_1 - \theta) = e \sin\delta$, $e_{qr} = e \cos\delta$,
hence $\partial e_{dr}/\partial x^{t} = -e \cos\delta$ and $\partial e_{qr}/\partial x^{t} = e \sin\delta$).




The currents, torque etc. calculated by this dynamical method check
term by term with the expressions given by Park for the salient-pole
alternator.


XLI.  The  Criterion  of  Dynamic  Stability 

If the coefficients in the motional impedance matrix of a machine
are all real numbers (as in case of the salient-pole synchronous machine)
an arithmetical method can be set up to investigate whether a machine
will return to its former equilibrium position if it is disturbed
by any cause whatever.

The determinant of the Motional Impedance $Z_{\alpha\beta}$ has the form

$D = a_0 p^{n} + a_1 p^{n-1} + \ldots + a_{n-1}\, p + a_n$ . . . . . 152

If all the real parts of the roots of the Determinantal Equation

$D = 0$ . . . . . 153

are negative, each current is a sum of terms of the form $A e^{pt}$ that decay
with time, all currents and speed variations will die out eventually and
the machine is dynamically stable.

In order that the real parts of all roots should be negative, the
coefficients of equ. 152 must satisfy the following conditions set up
by Hurwitz.

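1.) All the coefficients $a_0, a_1, \ldots$ must be positive.

2.) All the following n determinants formed by the coefficients must
be also positive

$$
\begin{vmatrix}
a_1 & a_3 & a_5 & \cdots & a_{2\lambda-1} \\
a_0 & a_2 & a_4 & \cdots & a_{2\lambda-2} \\
0 & a_1 & a_3 & \cdots & a_{2\lambda-3} \\
\vdots & & & & \vdots \\
0 & \cdots & & & a_{\lambda}
\end{vmatrix}
$$ . . . . . 154

where $\lambda$ varies from 1 to n.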
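
The two Hurwitz conditions can be turned into a short arithmetical routine. The sketch below assumes the coefficients are supplied in the order of equ. 152, with any coefficient outside that range taken as zero; the two example polynomials are hypothetical and are not determinants of any machine in the paper. Exact fractions are used for the determinants.

```python
from fractions import Fraction


def determinant(m):
    """Exact determinant by Gaussian elimination over fractions."""
    a = [[Fraction(x) for x in row] for row in m]
    n = len(a)
    det = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return det


def hurwitz_minors(coeffs):
    """Leading minors of the array of equ. 154 for coeffs = [a0, a1, ..., an]."""
    n = len(coeffs) - 1

    def a(k):
        return coeffs[k] if 0 <= k <= n else 0

    # Row i, column j (both counted from 1) holds a_(2j - i).
    H = [[a(2 * j - i) for j in range(1, n + 1)] for i in range(1, n + 1)]
    return [determinant([row[:k] for row in H[:k]]) for k in range(1, n + 1)]


def is_stable(coeffs):
    """True if all coefficients and all n determinants are positive."""
    return (all(c > 0 for c in coeffs)
            and all(d > 0 for d in hurwitz_minors(coeffs)))


# (p + 1)(p + 2)(p + 3) = p^3 + 6p^2 + 11p + 6: all roots negative.
print(hurwitz_minors([1, 6, 11, 6]), is_stable([1, 6, 11, 6]))
# (p + 2)(p^2 - p + 3) = p^3 + p^2 + p + 6: two roots with positive real part.
print(is_stable([1, 1, 1, 6]))
```

The second polynomial has all coefficients positive, which shows why condition 1 alone is not sufficient.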


THE CURVATURE TENSOR

XLII. Variable Character of the Equations of Oscillations

a.) A very important property of the Equation of Small Oscillations
(equs. 137 or 138) should be noted as opposed to that of the Equation
of Motion:

1.) No term of the equation is a tensor.




2.) No combination of terms can be set up that are tensors.

From these facts it follows that in case of machines with moving
coordinate axes each term must be transformed by a more or less
complicated transformation formula. For instance $d\Gamma^{\alpha}_{\beta\gamma}\, i^{\beta} i^{\gamma}$ is to be
transformed by

drjxiv  -  -f  d)  -  [(dr;,)  cjcid 


+  ri.dC'cia  +  r;,c;<ic:c;  +  r;,c‘cidci  + 


+  ^<Clji«CJ.>C; . 155 

b.)  However  the  Equation  of  Small  Torques  (equ.  143)  is  a  tensor 
equation.  The  first  three  terms  are  evidently  tensors  and  it  will  be 
proved  now  that  the  sum  of  the  last  three  terms,  representing  a  change 
in  the  machine  torque,  is  also  a  tensor,  that  is  both  the  stationary  and 
the  moving  observers  measure  the  same  change  of  torque,  which  of 
course must be true from physical considerations also. By equations
42, 55 and 155, since $C^{t}_{t}$ = unity


rty,idi^if  +  Vfy.ii^di^  + 

-  r.x..c;ci  (di'Cj  -I-  I'dcf)  i-c;  +  r.x.,c;cjt>c;  (dt'C;  +  t'dc;) 

-¥  dr,x..c;c5i'Cji'c;  -i-  r„.4Cic!ii'CU’Ci  +  r,x..c;dcii'c;i'c; 

-  ir„,  tdiU’  -f  r„.,i'dt'  -1-  dr„.,i'i') 

+  r.x.,cjc;  iCidct  d-  dcjcj)  -i-  r,x.iC;c;  (c^dc;  -i-  dcjc;). 

The  last  two  expressions  are  zero  since  the  terms  in  parenthesis  are 

diCiCt)  -  d(l)  -  0. 

c.) It is desirable to set up the Equation of Small Oscillations in a tensor
or invariant form, that is in such a form that it should be possible to pass
from one coordinate system to the other by the simple transformation
formula of tensors (equ. 36) instead of complicated formulae such as
equ. 55.

Referring back to section XXXVIII and to equ. 140 the Equation of
Small Oscillations was found from the Equation of Motion by taking
the differentials of each term, that is by

$de^{\alpha} = d(R^{\alpha}_{\beta}\, i^{\beta}) + d\left(\frac{\delta i^{\alpha}}{\delta t}\right)$




This form suggests that an invariant form of the Equation of Small
Oscillations can be found by taking the “absolute” differential of each term
instead of the ordinary differential. That is


d.) It is necessary to have on the right hand side $\delta i^{\alpha}$ (or rather $dx^{\alpha}$)
as the independent variable. Hence it is necessary to change

1.) the tensor $\delta(\delta i^{\alpha}/\delta t)$ into the sum of two tensors so that one of the
tensors should be $(\delta/\delta t)\,\delta i^{\alpha}$

2.) the tensor $\delta(R^{\alpha}_{\beta}\, i^{\beta})$ into the sum of two tensors so that one of the
tensors should be $R^{\alpha}_{\beta}\, \delta i^{\beta}$.

XLIII.  The  Generalized  Riemann-Christoffel  Curvature  Tensor 

a.) In order to find the difference between $\delta(\delta i^{\alpha}/\delta t)$ and $(\delta/\delta t)\,\delta i^{\alpha}$ let
each be developed in its component parts.


iH*dx^, 


The expression in parenthesis is a tensor of rank four, is denoted by
$K_{\delta\gamma\beta}{}^{\alpha}$ and called the “generalized Riemann-Christoffel curvature tensor”
or shortly “curvature tensor.” It plays a fundamental rôle in finding the
conditions of integrability of a set of linear differential equations. That is


Hence 




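b.) The difference between $\delta(R^{\alpha}_{\beta}\, i^{\beta})$ and $R^{\alpha}_{\beta}\, \delta i^{\beta}$ is $(\delta R^{\alpha}_{\beta})\, i^{\beta}$. But by

equ. 123,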

-  6(L‘yRya)i>  -  i6L-')Rygi»  + 

-  (dL->  +  L'^rr.dx*  +  L-'r:.dx‘)  Rygio  +  L->  (-  r;,R,a 

-  r;,Ry,)  i*dx> 

- 

The expression in parenthesis is a tensor of rank three, is denoted by
$R_{\gamma\beta}{}^{\alpha}$ and can be called the “resistance tensor of rank three” to
differentiate it from the resistance tensor of rank two $R^{\alpha}_{\beta}$. That is


XLIV. The Equation of Small Oscillations in Invariant Form

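a.) If equations 158 and 160 are substituted into equation 156 the
Equation of Small Oscillations becomes in terms of contravariant vectors

$\delta e^{\alpha} + \frac{\partial e^{\alpha}}{\partial x^{\delta}}\, dx^{\delta} = R^{\alpha}_{\beta}\, \delta i^{\beta} + \frac{\delta}{\delta t}\, \delta i^{\alpha} + K_{\delta\gamma\beta}{}^{\alpha}\, i^{\delta} i^{\beta}\, dx^{\gamma} + R_{\gamma\beta}{}^{\alpha}\, i^{\beta}\, dx^{\gamma}$ . . . . . 161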

In  terms  of  covariant  vectors  it  is 


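In the equations $\delta i^{\alpha}$ can not be replaced by $(\delta/\delta t)\, dx^{\alpha}$. The two
expressions are not equal due to the asymmetry of $\Gamma^{\alpha}_{\beta\gamma}$.

In terms of ordinary differentials equ. 161 becomes (if the disturbing
force $\delta e^{\alpha}$ is left out and only the free variation of $e^{\alpha}$, that is $(\partial e^{\alpha}/\partial x^{\delta})\, dx^{\delta}$,
is considered)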

de-  +  r;^dx’  -  d{R%if)  + 

-I-  r;,«^dx>  -  r;,R\w  +  didi-)/dt  -i-  | 

-h  Or;,/ax*  +  r:,rj,  -  rr,rj,)  i»i*dxy  j 

-I-  Or;,/dx»  -  ar;,  ax* -i-  rr.rj,  -  r:,ri,)  i«i*dx>  |  Aj ; g-iu^dzy. ..m 




It can be seen that the equation in invariant form consists of the previous
equation 137 with additional terms added and the same terms subtracted.
In equ. 137 the terms could not be grouped to form tensors but in the
presence of additional terms representing hypothetical voltages and
torques groups can be formed so that each group of terms is a tensor.

b.) In calculating all the possible coefficients $K_{\delta\gamma\beta}{}^{\alpha}$ the following
should be noted

1.) $K_{\delta\gamma\beta}{}^{\alpha} = -K_{\gamma\delta\beta}{}^{\alpha}$. This is the only skew-symmetry that occurs
in the generalized curvature tensor.

2.) $\alpha$ is always t. That is the curvature tensor is zero in the voltage
equation.

3.) Either $\delta$ or $\gamma$ is t, that is only $K_{t\gamma\beta}{}^{t}$ and $K_{\gamma t\beta}{}^{t}$ exist where also
$K_{t\gamma\beta}{}^{t} = -K_{\gamma t\beta}{}^{t}$. Hence all possible coefficients can be represented in
a matrix form $K_{t\gamma\beta}{}^{t}$ where only $\gamma$ and $\beta$ vary. All the other ($n^4 - n^2$)
coefficients are zero.

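Hence the Equation of Small Voltages in invariant form is

$\delta e^{\alpha} + \frac{\partial e^{\alpha}}{\partial x^{\delta}}\, dx^{\delta} = R^{\alpha}_{\beta}\, \delta i^{\beta} + \frac{\delta}{\delta t}\, \delta i^{\alpha}$ . . . . . 163

and the Equation of Small Torques in invariant form is

$\delta e^{t} = R^{t}_{\beta}\, \delta i^{\beta} + \frac{\delta}{\delta t}\, \delta i^{t} + K_{\delta\gamma\beta}{}^{t}\, i^{\delta} i^{\beta}\, dx^{\gamma} + R_{\gamma\beta}{}^{t}\, i^{\beta}\, dx^{\gamma}$ . . . . . 164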


XLV. Ricci Tensors


a.) In general $\Gamma^{\alpha}_{\beta\gamma}$ is a function of all the variables and it can be
differentiated with respect to all the variables. In the analysis of
rotating electrical machinery however $\Gamma^{\alpha}_{\beta\gamma}$ can be differentiated only
with respect to $x^{t} = \theta$, since it is a function only of $\theta$. Due to this
restricted variation, in calculating any coefficient in any coordinate
system it is found that two of the terms in equ. 157 are always zero, that is

A;;i‘  -  -  ^  +  Fi,r5, . 165 

and 


dx* 


-  rii 


F 


X. 

01 


166 


b.) In going over from one coordinate system to the other by a
non-holonomic transformation, for instance from $K_{t\gamma\beta}{}^{t}$ to $K_{tnm}{}^{t}$, the right
hand side of the above equations does not behave as a tensor, that is the
transformation formula of $K_{t\gamma\beta}{}^{t}$ is


r*  r'  — 


ax' 


(ri.ri,  -  -  r;,  ^  cz. 


167 


or 


K  -  •  —  r ‘  *  r* 

^tnm  *  ■m  w  • 


168 


Similar relations hold true in the corresponding terms of $\delta(\delta i^{\alpha})/\delta t$.

Certain polyadics, among them the curvature tensor, that behave as
tensors if the transformation is holonomic, lose their tensor character if
the transformation is non-holonomic. Such tensors are called by Dienes
“Ricci tensors.” It can be seen that the tensors that appear in the analysis
of infinitesimal displacements (hunting), like the curvature-tensor term and
$\delta(\delta i^{\alpha})/\delta t$, are Ricci tensors.

c.) However their sum is always a true tensor, which can be seen from
the following considerations by reference to equ. 162.

1.) The last three terms of equ. 162 belonging to the curvature-tensor term
are cancelled always by a similar expression belonging to $\delta(\delta i^{\alpha})/\delta t$, that is
the sum of the two sets of expressions is always a so-called null-tensor with
all its coefficients zero.

2.) The remaining expression $\partial\Gamma^{\alpha}_{\beta\gamma}/\partial x^{t}$ belonging to the curvature-tensor
term always forms a tensor with the terms $\Gamma^{\alpha}_{\beta\gamma}\, di^{\beta} i^{\gamma} + \Gamma^{\alpha}_{\beta\gamma}\, i^{\beta} di^{\gamma}$ belonging
to $\delta(\delta i^{\alpha})/\delta t$, as is proved in section XXXVIII b.

Hence if the values of the curvature-tensor term and $\delta(\delta i^{\alpha})/\delta t$ are calculated
for any coordinate system, their value can be found for any other coordinate
system as if each of them were tensors. This procedure is similar to
that followed in the calculation of the Equation of Motion in finding
separately $di^{\alpha}/dt$ and $\Gamma^{\alpha}_{\beta\gamma}\, i^{\beta} i^{\gamma}$ as if each of them were tensors
(section XXXII d).


XLVI.  Calculation  of  the  Curvature  Tensor 

a.) For the representative machine with moving coordinate axes the
coefficients of the curvature tensor as calculated from equ. 165 are




The difference between equs. 170 and 171 ought to be according to
equ. 167


(i,  (I,  q. 


The  sum  of  equs.  172  and  171  is  actually  equ.  170. 
c.)  The  coefficients  of  are  too  lengthy  to  be  published  here. 


DIFFERENTIAL  GEOMETRY 


XLVII. The Geometry of Holonomic Dynamics

a.)  In  a  Riemannian  space  there  are  two  lines  of  great  importance. 
They  are: 

1.)  the  geodesic  line  (the  tangent  to  the  curve  is  always  “absolutely” 
parallel  to  itself).  Its  equation  is 


173 

2.)  a  line  at  an  infinitesimal  distance  from  a  geodesic  line.  Its  equa¬ 
tion  is 


174 


b.) The set of differential equations of holonomic dynamics that are
derived from the Equation of Motion of Lagrange can be considered as
representing a particle describing a trajectory (path) under the action
of a force $f^{\alpha}$ in an n-dimensional Riemannian space. The equation of
the trajectory is

$f^{\alpha} = \frac{d^{2}x^{\alpha}}{dt^{2}} + \left\{ {\alpha \atop \beta\gamma} \right\} \frac{dx^{\beta}}{dt}\, \frac{dx^{\gamma}}{dt}$ . . . . . 175

The effect of a sudden infinitesimal disturbance is to force the particle
into a path infinitely close to its previous path. Eventually this new
path may diverge by a finite amount from its undisturbed path. The
equation of the disturbed path (as given by Synge) is for a Riemannian
space




This  equation  also  can  be  written  as 


8  rfx* 


di*  dx^  j  ,  ^ 

- dx^  —  —  djT 

dt  dt  8x* 


0 


177 


since $\delta(dx^{\alpha})$ is interchangeable with $d(\delta x^{\alpha})$ in a Riemannian space.

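c.) If it is assumed that the particle is subjected to a frictional force
proportional to the velocities along the various axes, the equation of the
trajectory is

$f^{\alpha} = R^{\alpha}_{\beta}\, \frac{dx^{\beta}}{dt} + \frac{d^{2}x^{\alpha}}{dt^{2}} + \left\{ {\alpha \atop \beta\gamma} \right\} \frac{dx^{\beta}}{dt}\, \frac{dx^{\gamma}}{dt}$ . . . . . 178

and the equation of the disturbed path is (replacing $\delta(R^{\alpha}_{\beta}\, dx^{\beta}/dt)$ by its two
components)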


8  .dx*  .  ,dj*dx^  dx*  8e“  .  . 


.  179 


d.) These equations are identical in form with the Equation of
Motion and the Equation of Small Oscillations developed in the paper
except $\left\{ {\alpha \atop \beta\gamma} \right\}$ is replaced by $\Gamma^{\alpha}_{\beta\gamma}$. Since in the equations of the machines
$\Gamma^{\alpha}_{\beta\gamma}$ is asymmetrical, the analogous dynamical problem can be stated
as follows:

The equations of rotating electrical machinery during accelerations are
identical with the equations of a particle moving in an n-dimensional
non-Riemannian space with asymmetric connection acted upon by a positional
force and opposed by a frictional force proportional to its instantaneous
velocity.


XLVIII.  The  Geometry  of  Rotating  Electrical  Machinery 

a.) The line-element of the surface on which the particle moves is
defined as

$ds^{2} = L_{\alpha\beta}\, dx^{\alpha}\, dx^{\beta}$ . . . . . 180

As long as the metric tensor of the surface $L_{\alpha\beta}$ and the coefficients of
connection $\Gamma_{\alpha\beta,\gamma}$ are defined in any arbitrary manner, it is not necessary
to know the relations between the variables $x^{\alpha}$. All (metric) properties of
the surface and of vectors on the surface in the neighborhood of a point
can be studied with the aid of the two sets of quantities $L_{\alpha\beta}$ and $\Gamma_{\alpha\beta,\gamma}$,
one having $n^{2}$ the other $n^{3}$ terms.




b.) The projections of the instantaneous position of the particle along
the curvilinear coordinate axes represent the total charges that passed
through each winding.

The unit tangent vector to the trajectory is

$\lambda^{\alpha} = \frac{dx^{\alpha}}{ds}$ . . . . . 181

c.) The velocity vector is along the tangent of the trajectory, that is

$\frac{dx^{\alpha}}{dt} = \frac{dx^{\alpha}}{ds}\, \frac{ds}{dt} = \lambda^{\alpha} v$ . . . . . 182

The projection of the velocity vector along the coordinate axes
represents the instantaneous currents flowing in each winding. The
magnitude of the velocity vector is

$v^{2} = \left(\frac{ds}{dt}\right)^{2} = L_{\alpha\beta}\, \frac{dx^{\alpha}}{dt}\, \frac{dx^{\beta}}{dt} = L_{\alpha\beta}\, i^{\alpha} i^{\beta} = 2T$ . . . . . 183

That is the square of the velocity vector is equal to twice the kinetic
energy stored in the machine at that instant.
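
Equ. 183 is a quadratic form in the currents and is easily evaluated. The sketch below uses a hypothetical two-winding inductance matrix and hypothetical currents; only the relation between the squared velocity and the kinetic energy is taken from the text.

```python
def quadratic_form(L, i):
    """Squared magnitude of the velocity vector: sum of L[a][b] * i[a] * i[b]."""
    n = len(i)
    return sum(L[a][b] * i[a] * i[b] for a in range(n) for b in range(n))


def kinetic_energy(L, i):
    """Stored kinetic (magnetic) energy T, from v^2 = 2T (equ. 183)."""
    return 0.5 * quadratic_form(L, i)


# Hypothetical metric (inductances in henries) and currents (amperes).
L = [[2.0, 0.5],
     [0.5, 1.0]]
i = [3.0, 4.0]

v_squared = quadratic_form(L, i)
T = kinetic_energy(L, i)
print(v_squared, T)
```

With a symmetric metric the two mutual terms are equal, which is why the mutual inductance enters the energy with a factor of two.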

d.) The acceleration vector $e^{\alpha}$ can be divided into $R^{\alpha}_{\beta}\, i^{\beta}$, which is in a
general direction, and into $\delta i^{\alpha}/\delta t$

where


V  -  ^  +  n^X'X’  -  ^ . 184 

d«  he 

$\kappa\mu^{\alpha}$ is the “first curvature vector,” $\mu^{\alpha}$ is the “principal normal” and $\kappa$
is the “first curvature.” Hence the voltage vector $\delta i^{\alpha}/\delta t$ (acceleration
vector) due to the presence of flux lines lies in the plane of $\lambda^{\alpha}$ and $\mu^{\alpha}$.

e.) The power is represented by the product of the acceleration vectors
and their projection along the velocity vector $\lambda^{\alpha}$. Since the power
represented by $\Gamma_{\beta\gamma,\alpha}\, i^{\alpha} i^{\beta} i^{\gamma}$ is zero (equ. 107) the generated voltage vector
due to the motion of rotor conductors and the torque vector must lie
in a plane perpendicular to the direction of motion, that is to $\lambda^{\alpha}$.

Hence the product of $\delta i^{\alpha}/\delta t$ and its projection on $\lambda^{\alpha}$ gives dT/dt while
the product of $e^{\alpha}$ and its projection on $\lambda^{\alpha}$ gives the power input.




f.) It is interesting to note that all physical space vectors actually
existing inside of all machines at each instant (current-density,
flux-density, voltage) are represented geometrically by vectors located in
an n-dimensional space or upon an m-dimensional surface in an
n-dimensional space (velocity, acceleration). The space or surface itself is
represented by the charges that do not form a space wave inside the
machine. Also all the properties of the space itself and of the machine
can be investigated if the metric and the coefficients of connection are
given, the particle and the machine are at rest and no vectors appear
in the space or inside the machine. Vectors do appear in the space and
inside the machine only when a voltage is applied to the machine and the
particle begins to move under the action of a force, the vectors showing the
instantaneous velocity and acceleration of the particle. The manner
of the motion of the particle of course depends on the type of the applied
forces but the characteristic properties of the space itself in which the
motion is described are independent of the applied forces; they only
depend on the metric and connection.

BIBLIOGRAPHY 

The theory of a set of linear differential equations has been attacked from several points of view: algebraic, geometrical, dynamical, etc. All points of view may use scalar, tensor or other symbolism.

Scalar notation

A.) Algebra. The theories developed in them may be used as labor-saving devices especially in the analysis of the “motional” and “transient” impedance tensors and in their synthesis, due to their matrix form.

Bôcher: Introduction to Higher Algebra.

Turnbull: The Theory of Determinants, Matrices, Invariants.

B.) Differential Geometry. They treat the theory of two-dimensional surfaces.

Eisenhart: Differential Geometry of Curves and Surfaces.

Weatherburn: Differential Geometry of Three Dimensions (uses Gibbs’ dyadic notation).

C.) Dynamics. The literature of the highly special dynamical systems to which rotating electrical machinery belongs is very scarce.

1.) Holonomic dynamical systems with mechanical and electromagnetic energy are treated in

Maxwell: Electricity and Magnetism. Vol. II.

Thomson, J. J.: Application of Dynamics to Physics and Chemistry.

2.) Non-holonomic dynamical systems are treated in

Whittaker: Analytical Dynamics. (Page 4&)

3.) Quasi-holonomic dynamical systems are analyzed (not by the dynamical equations however).




Appell: Sur une forme générale des équations de la dynamique. Mémorial des sciences math. Fasc. 1. 1925.

Lorentz: Die Maxwell'sche Gleichungen. Encyclopädie der Math. Wissenschaften. V. 2.


Tensor symbolism

A.) Differential Geometry: Multidimensional surfaces are analyzed usually in connection with the Theory of Relativity.

1.) Riemannian spaces are treated in

Eisenhart: Riemannian Geometry (Princeton University Press).

Struik: Grundzüge der Mehrdimensionalen Differentialgeometrie.

Duschek-Mayer: Lehrbuch der Differentialgeometrie.

Levi-Civita: The Absolute Differential Calculus.

2.) Non-Riemannian spaces with symmetric connection are treated in

Eddington: The Mathematical Theory of Relativity.

Weyl: Space, Time, Matter.

3.) Non-Riemannian spaces with asymmetric connection are treated in

Eisenhart: Non-Riemannian Geometry (Am. Math. Soc.).

Schouten: Der Ricci-Kalkül (Springer, Berlin).

4.) Non-Riemannian spaces with torsion and a metric are treated in

Hayden: Sub-spaces of a space with torsion. London Math. Soc. Proc. V. 34. 1932.

B.) Dynamics: The literature of conservative dynamical systems in tensor symbolism is very extensive. For non-conservative dynamical systems the literature is scarce.

1.) Holonomic dynamical systems are treated in a fundamental paper by

Synge: On the Geometry of Dynamics. Royal Soc. of London, Phil. Trans. A. 1926.

2.) Non-holonomic dynamical systems have been treated only during the last five years.

Synge: Geodesics in non-holonomic Geometry. Math. Annalen. Bd. 99. 1928.

Schouten: Über nicht-holonome Übertragungen in einer $L_n$. Math. Zeitschrift. Bd. 30. 1929.

Dienes: On the fundamental formulae of the geometry of tensor sub-manifolds. Journal de math. pures et appliquées. Ser. 9 tome 11. 1932.

As introductory books to the Absolute Calculus may be consulted:

Veblen: Invariants of quadratic differential forms. Cambridge Tracts in Math. No. 24.

Thomas, T. Y.: The elementary Theory of tensors. (McGraw-Hill.)

McConnell: Applications of the Absolute Differential Calculus. (Blackie & Son, Toronto.)

Articles on rotating electrical machinery referred to in the paper are

Park: Two-reaction theory of synchronous machinery, I. Trans. Am. Inst. Elec. Eng. June 1929.




Kron: Generalised theory of electrical machinery. Trans. A. I. E. E. June 1930.

Kron: Tensor analysis of rotating machinery, I. Winter Convention A. I. E. E. 1933.

Doherty & Nickle: Synchronous machines, I. Trans. A. I. E. E. Vol. 45. 1926.

Lyon: Transient conditions in electrical machinery. Trans. A. I. E. E. Vol. 42. 1923.

Arnold: Die Wechselstromtechnik.

The dynamical theory of the alternator as given by Maxwell appeared in

Ingram: The dynamical theory of a-c. machinery. Jour. Franklin Inst. Sept. 1930.

Dahlgren: A general electromagnetic theory of electric machines. Ingeniörs Vetenskaps Akademien. Handlingar Nr. 99. 1930.

Basilewitch: To the problem of general theory of electrical machinery. Electritchestvo. Jan. 1930.


THE  STRESS  DISTRIBUTION  IN  LONGITUDINAL  WELDS 
AND  ADJOINING  STRUCTURES 

By William Hovgaard*

I. INTRODUCTION

This  paper  is  an  account  of  a  theoretical  and  experimental  study  of 
the  stress  distribution  in  longitudinal  or  side  welds  and  in  the  struc¬ 
tures  which  they  connect.  The  work  was  carried  out  at  the  Massa¬ 
chusetts  Institute  of  Technology  during  the  years  1930  to  1933,  either 
directly  by  the  author  or  by  students  under  his  guidance,  largely  also 
under  the  guidance  of  Dr.  Heinrich  Hencky.  The  investigation  cen¬ 
tered  in  the  simple  and  fundamental  case  where  a  plate  subject  to  a 
longitudinal  pull  is  reenforced  by  a  bar  or  by  a  plate  strip,  which  is 
placed  flat  on  the  plate  or  normal  to  it  and  connected  to  it  by  longi¬ 
tudinal  fillet  welds.  The  plate  is  often  part  of  a  larger  structure,  and 
is  referred  to  in  the  following  as  the  “plate.”  In  experimental  work 
it  is  necessary  to  fit  the  reenforcing  member  symmetrically  in  two 
parts,  one  on  each  side  of  the  plate  in  order  to  avoid  bending  stresses, 
but  we  shall  generally  refer  to  them  as  one  unit:  “plate  strip,”  “bar,” 
or  “web,”  as  the  case  may  be. 

Figs.  1  and  2  illustrate  the  two  cases,  which  form  the  subject  of  the 
following  investigation. 

Referring to Fig. 1 it is at once clear by symmetry that there can be no shearing stress in the weld at the middle, O, and that we need only consider one end of the structure, say, from O to D. When the plate elongates under a pull, the bar will resist the motion and each point of the bar, with the exception of O, will be a little displaced relative to the corresponding point in the plate, the displacement increasing from zero at O to a maximum at the end of the bar. The shearing stresses in the weld will likewise increase from the middle toward the end. Within any transverse section the stresses and the displacements relative to the middle section vary from point to point both in the plate and in the bar, but the author, in his preliminary treatment of this problem, made the assumption that the average displacement in any section of the bar relative to that in the same section of the plate is proportional to the shearing stress in the weld at that section. The shearing stress in the weld was reckoned per sq. in. of the throat section. On this basis the author developed a theory for the stresses in the weld, first published in the Proceedings of the National Academy of Sciences for November 1930. We shall briefly describe this solution, as applied in particular to the test piece given in Fig. 1.

* Massachusetts Institute of Technology.

The  plate  is  subject  to  a  uniform  tensile  stress  p  at  the  ends  and 
tends  to  elongate  uniformly  over  its  entire  length,  but  in  this  it  is  pre¬ 
vented  by  the  bar,  which  causes  a  special  state  of  stress  in  the  whole 
structure.  It  is  the  object  of  the  present  paper  to  study  that  state  of 
stress. The origin is at O and the axis of OX falls along the centre line of the plate. We use the following notation:

$2L$ = total length of bar
$A$ = sectional area of plate
$a$ = sectional area of bar
$\gamma$ = throat area of weld per unit length
$q_x$ = shearing stress per square inch on throat area of weld at the point x
$p_x$ = average tensile stress in plate across a section at x
$p_b$ = average tensile stress in bar across a section at x
$U_x$ = average linear displacement of a transverse section of the bar relative to that of the corresponding section of the plate at x
$\mu = U_x / q_x$ = displacement coefficient: the ratio between the average displacement of the bar relative to the plate and the shearing stress in the weld
$E$ = Modulus of elasticity, assumed to be the same in bar, plate and weld.


It was realized that actually the stresses, and hence the elongations and displacements, are unevenly distributed across a transverse section in the plate as well as in the bar, and that the displacement of the bar relative to the plate is due partly to strains in those members and partly to strains in the weld material; but as a first attempt and in the absence of any accurate knowledge of the stress distribution, it was assumed that the ratio between $U_x$ and $q_x$ could be expressed by a constant or at least an average value of the “displacement coefficient,” $\mu$. The tensile load across any section of the plate is equal to the load $pA$ outside the bar minus the aggregate pull of the weld outside that section. Hence:

$$p_x A = pA - \gamma\int_x^L q_x\,dx \qquad (1)$$

Since the tensile load across any section of the bar is due entirely to the pull of the weld outside that section, we have:

$$p_b\, a = \gamma\int_x^L q_x\,dx \qquad (2)$$

It follows that:

$$p_x = p - \frac{a}{A}\,p_b \qquad (3)$$


The average displacement of the bar relative to the plate at x must be equal to the difference between the elongation of the plate and the bar from O to x. Hence putting

$$z = \int_x^L q_x\,dx \qquad (4)$$

we get:

$$U_x = \mu q_x = \frac{1}{E}\int_0^x (p_x - p_b)\,dx = \frac{px}{E} - \frac{\gamma\,(A + a)}{EAa}\int_0^x z\,dx \qquad (5)$$

From (4)

$$\frac{dz}{dx} = -q_x \qquad (6)$$

In the original paper the author used the Principle of Least Work and the Method of Variation to find the shearing stress $q_x$; but we shall here follow a somewhat simpler method suggested by Commander H. E. Rossell of the United States Navy, Professor of Naval Construction at the Massachusetts Institute of Technology, and leading to the same result.

Differentiating (5) and (6) and combining:

$$\frac{d^2 z}{dx^2} = m^2 z - \frac{p}{\mu E} \qquad (7)$$

where

$$m^2 = \frac{\gamma\,(A + a)}{\mu a A E} \qquad (8)$$

The solution of (7) is:

$$z = \frac{p}{\mu m^2 E} + R e^{mx} + S e^{-mx} \qquad (9)$$

Therefore

$$\frac{dz}{dx} = -q_x = mRe^{mx} - mSe^{-mx} \qquad (10)$$

Making use of the conditions $x = 0$, $q_x = 0$ and $x = L$, $q_x = q_L$, the first of which gives $R = S$, we find:

$$R = -\frac{q_L}{2m\sinh mL} \qquad (11)$$

and, from (10),

$$q_x = \frac{q_L\sinh mx}{\sinh mL} \qquad (12)$$

(9) and (11):

$$z = \frac{p}{\mu m^2 E} + 2R\cosh mx = \frac{p}{\mu m^2 E} - \frac{q_L\cosh mx}{m\sinh mL} \qquad (13)$$

(4) and (12):

$$z = \int_x^L \frac{q_L\sinh mx}{\sinh mL}\,dx = \frac{q_L}{m\sinh mL}\,(\cosh mL - \cosh mx) \qquad (14)$$

(13) (14):

$$\frac{q_L\cosh mL}{m\sinh mL} = \frac{p}{\mu m^2 E}$$

Therefore

$$q_L = \frac{p\tanh mL}{\mu m E} \qquad (15)$$

which is the same as given in previous papers by the author.
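The closed-form results (8), (12) and (15) lend themselves to a short numerical sketch. The block below evaluates the weld shearing stress along the bar using the specimen data and the displacement coefficient quoted in Chapter II of this paper; the function names are hypothetical, and the script is an illustration of the formulas rather than the author's own computing procedure.

```python
import math


def weld_constant(gamma, A, a, mu, E):
    """Return m from equation (8): m^2 = gamma (A + a) / (mu a A E)."""
    return math.sqrt(gamma * (A + a) / (mu * a * A * E))


def end_shear(p, m, mu, E, L):
    """Shearing stress q_L at the end of the bar, equation (15)."""
    return p * math.tanh(m * L) / (mu * m * E)


def shear_stress(x, p, m, mu, E, L):
    """Shearing stress q_x at distance x from the middle, equation (12)."""
    return end_shear(p, m, mu, E, L) * math.sinh(m * x) / math.sinh(m * L)


# Data quoted in Chapter II of the paper (inch and pound units).
gamma, A, a, E, L = 0.53, 4.50, 2.25, 30e6, 30.25
p = 23300.0    # stress in the plate beyond the bar, lbs. per sq. in.
mu = 0.90e-6   # displacement coefficient adopted in Chapter II

m = weld_constant(gamma, A, a, mu, E)
for x in (0.0, 10.0, 20.0, L):
    q = shear_stress(x, p, m, mu, E, L)
    print(f"x = {x:5.2f} in.   q_x = {q:8.1f} lbs. per sq. in.")
```

Because $q_x$ grows like $\sinh mx$, most of the shear is concentrated near the ends of the bar, which is the behaviour described in the Introduction.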


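Therefore

$$q_x = \frac{p\,\sinh mx}{m\mu E\cosh mL} \qquad (12')$$

From (1)

$$p_x = p - \frac{\gamma}{A}\int_x^L \frac{p\,\sinh mx}{m\mu E\cosh mL}\,dx = p\left[1 - \frac{a}{A + a}\left(1 - \frac{\cosh mx}{\cosh mL}\right)\right] \qquad (16)$$

The stress in the bar is found from the fundamental equation of equilibrium:

$$a\,p_b = A\,(p - p_x), \qquad p_b = \frac{A}{a}\,(p - p_x) \qquad (17)$$
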

It is of interest to note that the total load transmitted to the web is:

$$Q = \gamma\int_0^L q_x\,dx = \frac{\gamma p}{m\mu E\cosh mL}\int_0^L \sinh mx\,dx = \frac{pAa}{A + a}\left(1 - \frac{1}{\cosh mL}\right)$$

When the sectional area of the web is equal to that of the plate, $a = A$, we have:

$$Q = \frac{pA}{2}\left(1 - \frac{1}{\cosh mL}\right)$$

When $m = 0$, ($\mu = \infty$) and therefore $Q = 0$; when $m = \infty$, ($\mu = 0$) and $Q = \frac{pA}{2}$, whence it appears that not more than one-half of the load can be transmitted to the web in that case.
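The two limiting cases can be checked numerically with a few lines. This is a minimal sketch of the expression for $Q$ given above; the function name and the sample values of $p$, $A$ and $mL$ are assumptions chosen only to show the trend.

```python
import math


def web_load(p, A, a, mL):
    """Total load transmitted to the web: Q = p A a / (A + a) * (1 - 1/cosh mL)."""
    return p * A * a / (A + a) * (1.0 - 1.0 / math.cosh(mL))


# Illustrative values (assumed): equal sectional areas, unit stress.
p, A = 1.0, 4.0
for mL in (0.0, 0.5, 1.0, 2.0, 4.0, 8.0):
    Q = web_load(p, A, A, mL)
    print(f"mL = {mL:4.1f}   Q / (p A) = {Q / (p * A):.4f}")
```

The ratio $Q/(pA)$ starts at zero for a perfectly yielding weld ($mL = 0$) and approaches one-half as $mL$ grows, in agreement with the statement in the text.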

The  mathematical  solution  so  obtained  is  very  simple  in  form, 
although admittedly approximate, and it does not give any information
about  the  stress  distribution  in  detail.  In  order  to  apply  it  to  prac¬ 
tical  problems  it  is  necessary  to  know  the  value  of  the  displacement 
coefficient  for  any  given  joint  and  whether  it  is  actually  constant  for 
all  values  of  x  in  the  same  joint.  It  was  primarily  for  the  purpose  of 
investigating  these  questions  that  the  research  work  reported  in  the 
following  chapters  was  undertaken. 


II. EXPERIMENTS MADE BY THE AUTHOR IN THE YEARS 1930 AND 1931

The principal object of these experiments was to study the displacement coefficient $\mu$ in various joints, but after preliminary attempts with test pieces of different types it was decided to concentrate on the specimen illustrated in Fig. 1. This specimen was of mild steel and consisted of a narrow plate 9 ft. long, on each side of which was laid a still narrower plate strip of one-half the thickness of the plate, about 5 ft. in length, connected to the plate by four continuous fillet welds. Fig. 3 shows a photographic picture of the specimen with fittings for strain measurements. We regard the two strips as one and refer to them as the bar, and also the four lines of weld are in the analysis regarded as one.

The displacement of points on the bar and on the plate close to the weld was measured directly at both ends, A and F, and at two intermediate points on each side of the middle. Displacements were also obtained less directly from measurements of strains along the center line of the bars and along other lines on the bars and the plate parallel with the center line. The strains were measured partly with 20-inch and partly with 2-inch Berry gages, the latter being applied at five transverse stations on each of which seven measurements were taken on each side of the specimen.

The specimen was placed in a 400,000 lbs. Emery testing machine and subjected to tensions increasing from a small initial load up to about 200,000 lbs., corresponding to average stresses in the plate beyond the ends of the bar of about 45,000 lbs. per sq. in., while the aggregate section of plate and bar was stressed to an average of about 30,000 lbs. per sq. in. The test piece was just able to carry this load, but when going somewhat beyond this point, the elongation increased rapidly and at 250,000 lbs. the plate fractured at the lower end through the holes that were drilled for the measuring fixtures. It appeared that at a load of 105,000 lbs. the entire specimen was still within the elastic limit, and the analysis was carried out chiefly for this load.

Using the symbols defined in Chapter I, the numerical data were as follows:

$A$ = 4.50 sq. in.
$a$ = 2.25 sq. in.
$\gamma$ = .530 sq. in.
$L$ = 30.25 in.
$E$ = 30 × 10^6 lbs. per sq. in.

The averages of the displacements were computed from the strains for three sections, placed respectively 10 in. and 20 in. from the middle and at the ends, each average being a mean of eight readings.

For a load of $W$ = 105,000 lbs. the stress in the plate beyond the ends of the bar was:

$$p = \frac{105{,}000}{4.50} = 23{,}300 \text{ lbs. per sq. in.}$$

The average displacement of the bar relative to that of the plate was obtained by graphical and numerical computations from the observed data. Curves were constructed from the strains obtained with the 2-in. gages, and these curves when integrated gave the displacements of points on the bars or the plate relative to the middle. Thus $U_x$ could be found, but the result was not quite satisfactory because the part of the plate under the bars was not included, being inaccessible to measurements.

The local measurements taken with 2-in. Berry gages were supplemented by measurements with the 20-in. gages, which furnished displacements more directly for the bar at the centre line and for the plate at the outer edges.

The relative displacements of bar and plate measured across the weld by Huggenberger extensometers and special fittings were combined with those obtained by the strain gages.

In the experimental work the author had the assistance of Professor I. H. Cowdrey and Professor R. G. Adams.




In all more than 2000 measurements were taken, but the results of
the  analysis  were  disappointing,  although  they  threw  some  light  on 
the  problem  and  on  the  technique  of  carrying  out  the  experiments. 
In  spite  of  great  accuracy  in  the  manufacturing  of  the  test  piece  and 
in  the  fittings  used  for  applying  the  gages,  and  in  spite  of  careful 
annealing,  the  displacements  finally  obtained  were  not  quite  con¬ 
sistent.  This  is  ascribed  to  the  fact  that  the  measurements  were  neces¬ 
sarily  made  on  the  surface  of  the  various  members  and  that  hence  the 
major  part  of  the  body  of  the  structure  escaped  observation.  This  can 
be  seen  from  the  section  of  the  test  piece  given  in  Fig.  1. 

The measurements seemed to indicate that the bars elongated more
than  the  plate,  a  fact  which  can  only  be  explained  by  an  increased 
elongation  of  the  plate  in  the  region  along  the  centre  line  between  the 
welds,  which  region  was  hidden  by  the  bars. 

If  we  accept  the  theory  given  in  Chapter  I  as  an  approximate  solu¬ 
tion,  it  is  possible  to  obtain  a  value  for  the  displacement  coefficient. 

Consider the middle section, O, where the strains may be assumed to be most uniformly distributed. From the average measured strain in the bar we find the average stress $p_0$ by multiplying with $E$. But from (2), (12') and (8):

$$p_0 = \frac{Ap\,(\cosh mL - 1)}{(A + a)\cosh mL} \qquad (18)$$

$$\frac{\cosh mL - 1}{\cosh mL} = \frac{(A + a)\,p_0}{A\,p} \qquad (18')$$

from which $m$ can be found and then $\mu$ from (8):

$$\mu = \frac{\gamma\,(A + a)}{a m^2 A E} \qquad (8')$$


Substitute now numerical values:

The unit average strain in the bar at O was measured to be 5 × 10^-4 and hence

$$p_0 = 5 \times 10^{-4} \times 30 \times 10^{6} = 15{,}000 \text{ lbs. per sq. in.}$$

The load taken by the bar at the middle was therefore

$$P_0 = a\,p_0 = 2.25 \times 15{,}000 = 33{,}750 \text{ lbs.}$$

Stress in plate at middle:

$$\frac{W - P_0}{A} = \frac{105{,}000 - 33{,}750}{4.50} = 15{,}830 \text{ lbs. per sq. in.}$$

which shows a fairly uniform distribution of stresses in plate and bar. From (18'):

$$\frac{\cosh mL - 1}{\cosh mL} = \frac{3}{2} \cdot \frac{15{,}000}{23{,}300} = .941$$

which gives $\cosh mL = 16.9$, $mL = 3.52$

$$m = \frac{3.52}{30.25} = .116, \qquad m^2 = .0135$$

$$(8') \qquad \mu = \frac{.53 \times 6.75}{2.25 \times .0135 \times 4.50 \times 30 \times 10^{6}} = .87 \times 10^{-6}.$$


The same calculation was made for loads of 80,000 lbs. and 135,000 lbs. for the section at O, and similar calculations for sections at 15 in. from the middle. Finally a value of $\mu = .90 \times 10^{-6}$ was adopted as the best average.
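The step from the measured ratio to the displacement coefficient can be retraced with a short script. The sketch below takes the value .941 of $(\cosh mL - 1)/\cosh mL$ as printed, inverts it for $mL$, and then applies (8') with the specimen data of this chapter; the variable names are hypothetical.

```python
import math

# Specimen data from Chapter II (inch and pound units).
A, a = 4.50, 2.25     # sectional areas of plate and bar, sq. in.
L = 30.25             # half length of bar, in.
E = 30e6              # modulus of elasticity, lbs. per sq. in.
gamma = 0.53          # throat area of weld per unit length

ratio = 0.941         # (cosh mL - 1) / cosh mL, taken as printed in the text

cosh_mL = 1.0 / (1.0 - ratio)                # invert (18') for cosh mL
mL = math.acosh(cosh_mL)
m = mL / L
mu = gamma * (A + a) / (a * m**2 * A * E)    # equation (8')

print(f"cosh mL = {cosh_mL:.1f}, mL = {mL:.2f}, m = {m:.3f}")
print(f"mu = {mu:.2e} in. per (lb. per sq. in.)")
```

Since $\cosh mL$ is large here, small changes in the measured ratio shift $mL$ noticeably, which may be one reason the author averaged over several loads and sections before adopting a value.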

With this value of the displacement coefficient, the average stresses $p_x$, $p_b$ and $q_x$ were calculated from formulas (16), (17) and (12') and plotted in curves as shown on Fig. 4.

The values of $\mu$ given in previous papers* are much smaller, but were based on tests by other experimenters which gave the displacements of the bar relative to the plate close to the weld only, and were therefore not strictly applicable to the theory here given, which is based on average displacements across the whole section of plate and bar.


III. A THEORETICAL AND EXPERIMENTAL STUDY OF THE STRESSES IN A PLATE REENFORCED BY WEBS CONNECTED TO IT BY SPOT WELDS AT THE ENDS

This  study  was  made  as  a  thesis  for  the  degree  of  Master  of  Science 
by  three  students  in  the  Course  of  Naval  Construction,  R.  D.  Conrad, 
R.  A.  Hinners,  and  L.  V.  Honsinger,  all  lieutenants  j.g.  (CC)  in  the 
United  States  Navy. 

The thesis is entitled: “Stress Field of a Plane Plate Reenforced by
a Longitudinal Girder and Subjected to Tension,” and was prepared
under  the  special  supervision  of  Dr.  Hencky. 

* The Stress Distribution in Welded Overlapped Joints, Proc. Nat. Ac. 1930, p. 673.

After the experiences with the specimen with flat plate strips, described above, it was decided to use one of the type given in Fig. 2 with webs normal to the plate, but instead of using continuous welds, the webs were connected to the plate only by a spot weld at each end and one at the middle for positioning as shown in Fig. 5. This construction was adopted in order to facilitate the mathematical analysis, the primary aim of which was to determine the stress field. Judging from the approximate theory given in Chapter I, it seemed likely that the stress distribution in the plate and webs with terminal spot welds would not be very different from that with continuous welds. The problem for a single force operating at a point in a plane plate of infinite extent had been previously solved,* so that expressions were at once available for the component stresses $\sigma_x$, $\sigma_y$, $\tau$ and for the displacements $u$, $v$, in the $x$, $y$ directions. In generalised form the stresses are given by;

[Figure: Test specimen No. 2 with spot welds]

It was also known how to determine the resultant force R of the stresses at the nucleus of strain which was taken as origin.

Now, the stresses can be expressed as the second derivatives of an Airy stress function;

so that this function can be determined by integrating equation (19) twice.

* Love: Th. of Elasticity, 3 ed., p. 207.
E. Melan, Zeitsch. für Angew. Math. und Mech., 1925, pp. 314-318.



The  F-Solution 

The stress function was constructed for two forces P acting in opposite directions, one at each end of the web, A and F, Fig. 5, and was found to be:

$$\frac{1}{C}F_I = -\frac{D}{2}(x + L)\log\left[y^2 + (x + L)^2\right] + y\tan^{-1}\frac{y}{x + L} + \frac{D}{2}(x - L)\log\left[y^2 + (x - L)^2\right] - y\tan^{-1}\frac{y}{x - L} \qquad (21)$$

where C and D are constants.

The function $F_I$ satisfies the biharmonic differential equation:

$$\nabla^4 F_I = 0 \qquad (22)$$

but $F_I$ must be so modified as to satisfy at the same time the conditions existing at the finite boundaries.
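That the two kinds of term appearing in the infinite-plate stress function are biharmonic can be checked numerically. The sketch below applies the standard 13-point finite-difference stencil for $\nabla^4$ to a term of the logarithmic type and a term of the arc-tangent type; the half-distance $L = 12$ in. between the nuclei is taken from the text, while the test point, the step $h$ and the function names are assumptions made for the illustration. A plain $x^4$ term, for which $\nabla^4 = 24$, is included to show that the stencil does detect a non-biharmonic function.

```python
import math

L = 12.0  # half distance between the two nuclei of strain, in.


def biharmonic(f, x, y, h=0.05):
    """13-point finite-difference estimate of the biharmonic operator at (x, y)."""
    return (20.0 * f(x, y)
            - 8.0 * (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h))
            + 2.0 * (f(x + h, y + h) + f(x + h, y - h)
                     + f(x - h, y + h) + f(x - h, y - h))
            + f(x + 2 * h, y) + f(x - 2 * h, y)
            + f(x, y + 2 * h) + f(x, y - 2 * h)) / h**4


def log_term(x, y):
    """Term of the type (x + L) log[y^2 + (x + L)^2]."""
    return (x + L) * math.log(y * y + (x + L) ** 2)


def angle_term(x, y):
    """Term of the type y arctan[y / (x + L)], valid here for x + L > 0."""
    return y * math.atan2(y, x + L)


def quartic(x, y):
    """Not biharmonic: the operator applied to x^4 gives 24."""
    return x ** 4


for name, f in (("log term", log_term), ("angle term", angle_term), ("x^4", quartic)):
    print(f"{name:10s}  biharmonic estimate = {biharmonic(f, -6.0, 4.0):.6f}")
```

The first two estimates vanish to within the discretisation error, while the third does not; the same test could be applied to any proposed corrective function before it is added to the solution.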

Disregarding  for  the  present  the  pull  P  of  the  testing  machine,  to 
be  dealt  with  separately  afterwards,  we  have  the  following  boundary 
conditions: 


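For the end boundaries $x = \pm 2L$: $\sigma_x = 0$, and $\frac{\partial^2 F}{\partial y^2} = 0$
For the side boundaries $y = \pm b$: $\sigma_y = 0$, and $\frac{\partial^2 F}{\partial x^2} = 0$
For all boundaries: $\tau = 0$, and $\frac{\partial^2 F}{\partial x\,\partial y} = 0$ (23)

Since the plate is long and narrow we can neglect for the present the end boundary conditions; any residual stresses at the ends are easily handled later. For the side boundaries it is sufficient that the modified function $F_{II}$ shall satisfy the conditions:

$$F_{II} = 0, \quad \text{and} \quad \frac{\partial F_{II}}{\partial y} = 0.$$

Two functions $F_2$ and $F_3$ are now determined so that

$$F_{II} = F_I + F_2 + F_3 = 0, \qquad \frac{\partial F_{II}}{\partial y} = \frac{\partial F_I}{\partial y} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial y} = 0 \qquad \text{for } y = \pm b \qquad (24)$$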

$F_2$ and $F_3$ were determined without difficulty so as to satisfy (23) and (24) and the result was the following expression;

^^Fn~  -^(x  +  L)  log (y*  +  (j-  +  L)*l  +  ^ (x  -  L)  log  [y*  +  (x  -  L)*) 
+  I  (x  +  L)  log  Ife*  +  (x  +  m  -  (x  -  L)  log  Ife*  +  (x  -  /.)*) 


[Fig. 6]


This equation, however, fails to satisfy the biharmonic equation (22). An attempt was made to construct a function $F_5$ which, when added to $F_{II}$, should make it satisfy (22). In order better to visualise the problem a surface was constructed, the ordinates of which represented $F_{II}$. From equations (24) it follows that this surface is shaped as if it were clamped flat at the side boundaries on the XY plane, but as seen from Fig. 6 the surface has a sharp ridge between the two nuclei, $x = \pm 12$ in., constituting a discontinuity, which can only exist if the surface is cut along this line. This is equivalent to an internal boundary, which does not actually exist. The surface represented by $F_5$ was studied by a graphical and analytical process analogous to Wieghardt’s method,* which showed that the complete determination of $F_5$ was impracticable because it was too involved. The various steps in the process, which is very interesting, are indicated in the thesis and enough was learned from the investigation to indicate that the addition of $F_5$ to $F_{II}$ would not change the latter appreciably and that hence $F_5$ can probably be neglected.

* Föppl, Drang und Zwang, I, p. 248.

Another function $F_4$ was constructed for the purpose of annulling the residual stresses which in accordance with the formula for $F_{II}$ exist at the end boundaries $x = \pm 2L$. These stresses are determined by finding the value of $\frac{1}{C}\frac{\partial^2 F_{II}}{\partial y^2}$ for $x = 2L = 24$ in. On Saint Venant’s principle it was assumed that this stress would soon become uniform as we pass into the plate, and that hence it could be represented by the function:

$$\frac{1}{C}F_4 = K\,(b^2 - y^2) \qquad (26)$$

This is independent of x and satisfies the boundary conditions (24) and the biharmonic equation.
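The stresses that follow from a function of the form (26) can be illustrated by differentiating it numerically. The sketch below assumes the usual Airy convention, $\sigma_x = \partial^2 F/\partial y^2$, $\sigma_y = \partial^2 F/\partial x^2$, $\tau = -\partial^2 F/\partial x\,\partial y$, which agrees with the pairing of stresses and derivatives in (23); the values of $K$ and $b$ and the function names are assumed for the illustration, and the stresses are expressed per unit of the constant $C$.

```python
def airy_stresses(F, x, y, h=1e-2):
    """Central-difference stresses from an Airy function F(x, y).

    Assumed convention: sigma_x = F_yy, sigma_y = F_xx, tau = -F_xy.
    """
    sigma_x = (F(x, y + h) - 2.0 * F(x, y) + F(x, y - h)) / h**2
    sigma_y = (F(x + h, y) - 2.0 * F(x, y) + F(x - h, y)) / h**2
    tau = -(F(x + h, y + h) - F(x + h, y - h)
            - F(x - h, y + h) + F(x - h, y - h)) / (4.0 * h**2)
    return sigma_x, sigma_y, tau


K, b = 1.5, 4.0   # illustrative constants (assumed)


def F4(x, y):
    """Function of the form (26), per unit C: K (b^2 - y^2)."""
    return K * (b * b - y * y)


for point in ((0.0, 0.0), (10.0, 2.0), (-20.0, -3.5)):
    sx, sy, t = airy_stresses(F4, *point)
    print(f"(x, y) = {point}:  sigma_x = {sx:.4f}, sigma_y = {sy:.4f}, tau = {t:.4f}")
```

At every point the sketch returns the same longitudinal stress, $-2K$ per unit $C$, with no transverse or shearing stress, which is the uniform state the function is meant to represent.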

Finally a function $F_p$ was added to represent the uniform stress parallel with OX produced by the testing machine. This function is of the same form as (26):

$$\frac{1}{C}F_p = H\,(b^2 - y^2) \qquad (27)$$

Since the sectional area of the web is the same as that of the plate it was at first assumed that each member carried one-half of the load $W$,

and  thus  the  value  of  the  constant  H  should  be  -t-  Experimentally 

0 

it  was  found  however  that  the  web  took  only  3/8  of  the  load,  so  that 
the  value  of  //  became  | 

0 

Thus the final stress function with omission of $F_5$ becomes:

$$F = F_I + F_2 + F_3 + F_4 + F_p \qquad (28)$$

which satisfies all conditions except that of the elastic equation. We refer to this as the F-solution. It may be expected to give fairly good results for the stresses except in the vicinity of the singular points A and F, since failure to satisfy the elastic equation will there have the greatest effect. This suggested the desirability of finding another stress function, referred to in the following as the G-solution.


The G-Solution

The function $F_I$ satisfied the elastic equation for the infinite plate and was modified so as to satisfy the boundary conditions rigorously, but in so doing conformity with the equation $\nabla^4 F = 0$ was lost. It is now proposed to start from the same function $F_I$ but to modify it in such a way as to retain rigorous conformity to the biharmonic equation, while conforming only approximately to the boundary conditions. In this way a better result is obtained for the important central portion of the plate and notably in the vicinity of the loads P.

The function $F_I$ is transformed to a more convenient non-dimensional form $\frac{1}{CL}G_I$, which still by differentiation gives the stresses and which still conforms to the biharmonic equation. Now a corrective function $\phi$ is added to $G_I$ of such a nature that:

$$G_{II} = G_I + \phi \qquad (29)$$

shall satisfy the biharmonic equation; that is, $\phi$ must itself satisfy this equation. The side boundary conditions are the same as for the F-solution.

$$G_{II} = 0, \qquad \frac{\partial G_{II}}{\partial y} = 0 \qquad \text{(approximately)} \qquad (30)$$

In order to fulfil these equations $\phi$ is expressed in a trigonometric
series  associated  with  hyperbolic  functions: 


-L^ 


cosh 


m-ey 

ir 


mwi  .  o 
cos  —f  -f  Pn 
4L 


mwx  .  mwx ' 

IT  ITy 


(31)


Obviously the boundary conditions (30) are satisfied if $\phi$ is such that:


_1_ 

CL  dy 


i'L  dy) 


for  y  ±.h 


(32) 


It was found by a graphical representation of these equations that they could be satisfied with sufficient accuracy by including in (31) terms only for m = 0, 1, and 2. Residual normal and shearing stresses do indeed remain along the side boundaries, but an investigation showed that these stresses could be neglected out to a distance from the middle of about x = 20″. The only exception is that the computed values of $\sigma_y$ must be corrected for residual $\sigma_y$-stresses in the region abreast of the singular points, and this correction was determined for x = 11″ and x = 13″ in an Appendix to the thesis.

The  pull  of  the  testing  machine  was  corrected  for  as  in  the  F-solution, 
giving  a  function  Gp. 

The complete form of the G-function is:


-1-  r tan“‘  — 7-7  —  tan~'  — +  4.54  cosh  ^  cos  ^ 
L  L  X  +  L  X  —  J  \L  AL 


+  5.32  cosh 


^  r 

4L  AL 


X  .  wi 

sin 


2.61  cosh  ^  cos  ^ 


+  .509  cosh  ^  ^  ^  (5*  -  V*)  . (33) 


Mapping  of  the  Airy  Surface 

Computations of the ordinates for both the F and the G-surfaces were made, but the F-surface was selected for plotting by means of contour lines given in the thesis on a diagram which is not here reproduced. We refer to the sketch given in Fig. 6, which, although less accurate, makes it easy to visualize the peculiar form of the surface. The curvatures at any point represent the normal stresses, and the twist represents the shearing stresses. The cusp-shaped ridge has the same slope ±π along the center-line between the singular points, in agreement with the fact that, since by symmetry there is no shearing stress, there can be no twist in the Airy surface along this line.


Stress  Calculations 

In order to make a comparison with the experimental results and in order to delineate the stress field, a computation of the stresses was made by differentiation of the F as well as the G-functions, in accordance with (20), but while the F-function was preferred for mapping the surface, the G-function was found more suitable for obtaining the stresses, because the F derivatives were unwieldy for numerical calculations.

The stress computations were carried out for all points of a network of inch squares over a zone between x = 11″ and x = 13″, and also for points across the transverse center line, x = 0.

The  constants  C  and  //  were  calculated  for  a  load  distribution  between 
web and plate of 3/8P and 5/8P respectively, as determined by the strain measurements at the middle. The reason for this uneven distribution was probably incipient plasticity in the end welds.

For the transverse section at O the $\sigma_x$-stress in the plate is practically uniform and is close to the value of 16,670 lbs. per sq. in., corresponding to five-eighths of the total load of the machine, W = 40,000 lbs.

Experimental Work

First the elastic constants E and m were determined by tests on a coupon bar made of the same steel as the specimen. The result was E = 30,200,000 lbs. per sq. in. and m = 3.64.

By measurements taken at mid-section, x = 0, both on the plate and on the web, at a test load of 40,000 lbs., it was found that 25,150 lbs. was transmitted through the plate and 15,200 lbs. through the web, which, as stated above, was in the ratio of about 5/8 to 3/8 of the total load. That is, P = 3/8 W.
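The load split quoted above can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the text; the variable names are illustrative, and the "implied area" is merely the quotient of the nominal plate load and the quoted stress, not a dimension stated in the thesis.

```python
# Load split between plate and web at mid-section (figures quoted in the text).
W = 40_000.0           # test load, lbs
plate_load = 25_150.0  # measured load through the plate, lbs
web_load = 15_200.0    # measured load through the web, lbs


def load_fractions(plate, web):
    """Return (plate fraction, web fraction) of the measured total."""
    total = plate + web
    return plate / total, web / total


plate_frac, web_frac = load_fractions(plate_load, web_load)

# Nominal split adopted in the calculations: 5/8 to the plate, 3/8 to the web.
nominal_plate = 5.0 / 8.0 * W
nominal_web = 3.0 / 8.0 * W  # this is the load P

# Plate sectional area implied by the quoted stress of 16,670 lbs per sq. in.
# (a derived figure, not one given in the text).
implied_area = nominal_plate / 16_670.0

print(f"measured fractions: plate {plate_frac:.4f}, web {web_frac:.4f}")
print(f"nominal loads: plate {nominal_plate:.0f} lbs, web {nominal_web:.0f} lbs")
print(f"implied plate area: {implied_area:.4f} sq. in.")
```

The measured loads add up to slightly more than the nominal 40,000 lbs., which is why the fractions are described as "about" 5/8 and 3/8.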

Time did not permit an experimental study of the complete stress field, and measurements were made therefore only at points in the vicinity of the end welds where the greatest irregularities in the stress distribution might be expected to exist. Transverse and longitudinal strains were measured on both faces and both sides for the points: y = 1″ and 2″ at x = 11″ and 12″; and for y = 0″, 1″, 2″ at x = 13″; in all seven points. At the points where y = 1″ additional measurements were made in the 45° and 135° directions, giving “rosettes” of strain from which the principal stresses could be determined.

Sufficient data were thus available for computing $\sigma_x$ and $\sigma_y$ for all seven points and in addition for computing $\tau$ and the direction of the principal stresses for the three points where y = 1″.
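The reduction of such rosette readings to stresses can be sketched as follows. This is not the procedure recorded in the thesis, only the standard plane-stress reduction under stated assumptions: the 0° gauge lies along x, the 90° gauge along y, the material is isotropic with Poisson's ratio 1/m, and the function name and argument order are hypothetical.

```python
import math


def rosette_to_stress(e0, e45, e90, e135, E=30.2e6, m=3.64):
    """Reduce four strain readings (0, 45, 90, 135 degrees) to plane stresses.

    Returns (sigma_x, sigma_y, tau_xy, sigma_1, sigma_2, angle_deg), where
    angle_deg is the inclination of sigma_1 to the x axis.
    E and m default to the coupon values quoted in the text.
    """
    nu = 1.0 / m                      # Poisson's ratio from Poisson's number m
    ex, ey = e0, e90
    gamma = e45 - e135                # engineering shear strain
    c = E / (1.0 - nu * nu)
    sx = c * (ex + nu * ey)
    sy = c * (ey + nu * ex)
    txy = E / (2.0 * (1.0 + nu)) * gamma
    mean = 0.5 * (sx + sy)
    radius = math.hypot(0.5 * (sx - sy), txy)
    angle = 0.5 * math.degrees(math.atan2(2.0 * txy, sx - sy))
    return sx, sy, txy, mean + radius, mean - radius, angle


# Example: readings that would arise from a pure longitudinal stress.
_nu = 1.0 / 3.64
_ex = 10_000.0 / 30.2e6
_ey = -_nu * _ex
uniaxial = rosette_to_stress(_ex, 0.5 * (_ex + _ey), _ey, 0.5 * (_ex + _ey))
```

With only the 0° and 90° gauges, as at the four points where no rosette was taken, the same function gives the two direct stresses if the 45° and 135° readings are both set to their mean value, but the shear is then undetermined.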

Rubber  Model 

At the suggestion of Dr. Hencky, tests were made with a vulcanized rubber band to one-third scale. On the band were two holes held by pins attached to steel bars which represented the webs. At one end was a rectangular network of straight white lines; at the other a system of white circles was marked off, the distortions of which, when the band was stretched, gave indications of the directions of the principal stresses and of the shear. This ingenious device was of great assistance in delineating the stress field, although it did not give quantitative results.

The  Stress  Field 

The  stress  field  was  constructed  on  the  basis  of  the  stress  calculations 
and  the  experimental  results,  aided  by  the  map  of  the  Airy  surface  and 
the  results  from  the  rubber  model,  as  also  by  comparison  with  the 
stress  fields  obtained  by  photoelasticity  for  certain  similar  problems. 

Comments 

Although  this  thesis  gives  but  a  partial  solution  of  the  problem, 
dealing  only  with  the  stresses  in  the  plate  and  with  terminal  spot  welds, 
it  paves  the  way  for  further  study,  and  is  of  special  value  on  account 
of  the  complete  and  lucid  presentation.  In  particular,  it  explains  the 
use  of  the  Airy  stress  function  and  its  adjustment  to  given  boundary 
conditions.  It  discusses  very  completely  the  nature  of  the  errors  intro¬ 
duced  when  the  stress  function  fails  to  satisfy  the  biharmonic  equation, 
a  point  of  which  space  does  not  permit  a  full  discussion  in  this  abstract. 
It  explains  in  a  very  instructive  manner  how  it  proved  impracticable 
in this case to satisfy rigorously both the equation $\nabla^4 F = 0$ and the
boundary  conditions,  and  that  by  satisfying  one  of  these  requirements 
approximately  and  the  other  rigorously  it  was  possible  to  obtain  results 
of  practical  importance.  In  fact,  a  careful  study  of  this  thesis  will  be 
well  worth  while  for  any  student  who  wishes  to  investigate  the  stress 
distribution  in  plane  plates. 

The  thesis  shows  the  necessity,  in  a  problem  of  this  kind,  of  preparing 
the  test  piece  with  the  greatest  possible  accuracy  so  as  to  eliminate  all 
disturbing  and  complicating  factors.  All  surfaces  must  be  perfectly 
machined  and  the  dimensions  exact.  The  specimen  must  be  thoroughly 
annealed  and  after  annealing,  the  scale  must  be  removed,  preferably 
by  pickling,  since  otherwise  such  gages  as  the  Huggenberger  instru¬ 
ments  cannot  give  reliable  results. 

On comparing the calculated and the observed results, a fair agreement was found in the longitudinal stresses $\sigma_x$, but discrepancies appeared in the transverse stresses near the weld, due partly to the fact that the analytical solution was obtained for a load concentrated at a singular point instead of being actually distributed over a finite area, and partly because the magnitude of $\sigma_y$ in general was relatively small and therefore difficult to measure accurately. The direction of the principal stresses at the few points where they were obtained by strain measurements showed a good agreement with the computed lines of stress.

IV. A MORE COMPLETE SOLUTION OF THE SAME PROBLEM: THE WEBS ARE CONNECTED TO THE PLATE BY CONTINUOUS WELDS AND THE STATE OF STRESS IS DETERMINED BOTH FOR THE PLATE AND FOR THE WEBS

This  investigation  was  carried  out  by  Dr.  Y.  C.  Yeh  in  preparation 
of  his  thesis  for  the  degree  of  Doctor  of  Science  at  the  Massachusetts 
Institute  of  Technology,  and  is  a  continuation  of  the  research  described 
in Chapter III. The thesis is entitled: “The Distribution of Stresses in Welded Structures” and is fundamental as far as “side welds” or “longitudinal welds” are concerned. The thesis gives for the first time
a  formula  for  the  shearing  stress  in  a  side  weld,  derived  from  the  general 
theory  of  the  stresses  in  plane  plates,  and  is  extended  to  comprise  the 
case  of  a  web  tapering  in  depth  towards  the  ends.  The  theoretical 
results  were  corroborated  by  a  limited  number  of  experiments. 

The  first  part  of  the  thesis  was  prepared  under  the  guidance  of 
Dr.  Hencky. 

A.  Fundamental  Assumptions  and  Mode  of  Attack 

Since  the  plate  and  the  webs  are  of  small  thickness  the  problem  was 
treated  as  one  of  generalised  plane  stress,  and  as  the  sectional  area  of 
the  weld  was  small  relative  to  that  of  the  plate  and  web,  it  was  assumed 
that  the  direct  stresses  in  it  could  be  neglected,  so  that  it  was  subject 
only  to  shearing  stresses.  The  webs  were  regarded  as  one,  and  the 
welds  were  conceived  to  merge  into  one  hypothetical  line  of  length  2L, 
belonging  both  to  the  plate  and  the  web.  Thus  the  problem  was 
reduced  from  a  three-dimensional  to  a  two-dimensional  one;  the  plate 
and  the  web  could  be  dealt  with  as  two  separate  plane  problems,  only 
it  is  required  that  the  solutions  shall  give  equal  displacements  and 
equal  but  opposite  shearing  stresses  along  the  weld  line,  which  forms 
the  internal  boundary  in  both. 

Under  these  assumptions  the  formulation  of  the  internal  boundary 
conditions  is  simplified,  and  by  symmetry  we  need  consider  only  one 
end  of  the  test  piece,  lying  on  one  side  of  the  transverse  central  plane. 
The  problem  is  then  to  explore  the  stress  distribution  in  both  the  plate 
and  the  web,  determined  primarily  by  the  longitudinal  pull  of  the 




testing  machine  and  the  resistance  to  elongation  offered  by  the  web 
along  the  line  of  the  weld.  The  stress  distribution  is  throughout  influ¬ 
enced  by  the  presence  of  the  external  finite  boundaries,  but  before  these 
are  taken  into  account  it  is  first  assumed  that  both  plate  and  web  are 
of  infinite  extent  or  at  least  of  very  large  extent. 

Due  to  the  fact  that  the  shearing  forces  transmitted  through  the 
hypothetical  weld  line  theoretically  become  infinite  at  the  ends  of  the 
weld,  it  is  convenient  to  employ  elliptic  coordinates.  These  are  defined 
by  the  complex  function: 


$$z = x + iy = L\cosh(\alpha + i\beta) \qquad (34)$$

which represents a system of orthogonal curves in the xy plane given by the equations:

$$x = L\cosh\alpha\,\cos\beta, \qquad y = L\sinh\alpha\,\sin\beta \qquad (35)$$

The curves α = constant are confocal ellipses:

$$\frac{x^2}{L^2\cosh^2\alpha} + \frac{y^2}{L^2\sinh^2\alpha} = 1 \qquad (36)$$

the foci of which, (±L, 0), are at the ends of the weld line. The curves β = constant represent a family of hyperbolas:

$$\frac{x^2}{L^2\cos^2\beta} - \frac{y^2}{L^2\sin^2\beta} = 1 \qquad (37)$$

which are confocal with and orthogonal to the ellipses. The eccentric angle of an ellipse at the point of intersection with a hyperbola is equal to β of that hyperbola. For positive values of α and for values of β from 0 to 2π or from −π to +π, the xy plane corresponds to a rectangle in the αβ plane, infinite in one direction.

The focal ellipse α = 0 is identical with the weld line. Since the betas change sign when the hyperbolas cross this line, we have here a discontinuity or singular line.
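The properties of the elliptic coordinates stated above are easily verified numerically. The following is a minimal sketch, assuming nothing beyond the mapping z = L cosh(α + iβ); the function names and the sample values of α, β and L are arbitrary choices for illustration.

```python
import cmath
import math


def elliptic_to_xy(alpha, beta, L=1.0):
    """Map elliptic coordinates to (x, y) through z = L cosh(alpha + i beta)."""
    z = L * cmath.cosh(complex(alpha, beta))
    return z.real, z.imag


def ellipse_residual(x, y, alpha, L=1.0):
    """Left side minus right side of the confocal ellipse equation."""
    return (x / (L * math.cosh(alpha))) ** 2 + (y / (L * math.sinh(alpha))) ** 2 - 1.0


def hyperbola_residual(x, y, beta, L=1.0):
    """Left side minus right side of the confocal hyperbola equation."""
    return (x / (L * math.cos(beta))) ** 2 - (y / (L * math.sin(beta))) ** 2 - 1.0


def radius_ratio(alpha, beta, L=1.0):
    """Distance from the origin divided by L * exp(alpha) / 2."""
    x, y = elliptic_to_xy(alpha, beta, L)
    return math.hypot(x, y) / (0.5 * L * math.exp(alpha))


x, y = elliptic_to_xy(0.7, 1.1, L=12.0)
print(ellipse_residual(x, y, 0.7, L=12.0), hyperbola_residual(x, y, 1.1, L=12.0))
print(radius_ratio(8.0, 0.3))  # approaches 1 as alpha grows
```

On the line α = 0 the mapping gives y = 0 and x = L cos β, so that β and −β fall on the same point of the weld line; this is the discontinuity referred to in the text. For large α the curves α = constant approach circles of radius L e^α / 2.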

When α is large, the equations (35) represent a circle of radius $\tfrac{1}{2}Le^{\alpha}$, so that for an infinite plate we can go over to ordinary polar coordinates at the outer boundaries, writing:




where σ denotes a direct stress normal to the curve indicated by the suffix and τ is the tangential stress.


B.  The  Boundary  Conditions 

Since  both  plate  and  web  are  for  the  present  assumed  to  be  of  large 
or  infinite  extent,  the  form  of  the  general  solution  must  be  the  same 
for  both,  but  the  plate  is  subject  to  a  longitudinal  pull  p,  while  the 
web  is  free  from  external  loads.  The  same  symbols  are  used  for  the 
web  as  for  the  plate,  but  the  former  are  distinguished  by  a  dash. 

The boundary conditions for the plate are for large α:

$$\sigma_\alpha = \frac{p}{2}(1+\cos 2\beta), \qquad \sigma_\beta = \frac{p}{2}(1-\cos 2\beta), \qquad \tau_{\alpha\beta} = -\frac{p}{2}\sin 2\beta \qquad (39)$$
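Conditions (39) are simply the polar components of a uniform pull p in the x direction, with β playing the part of the polar angle at a great distance. A short sketch of that check follows; the function names are illustrative and the rotation formulas are the usual ones for plane stress.

```python
import math


def rotate_stress(sx, sy, txy, theta):
    """Resolve the plane stress (sx, sy, txy) on axes turned through theta."""
    c, s = math.cos(theta), math.sin(theta)
    s_r = sx * c * c + sy * s * s + 2.0 * txy * s * c
    s_t = sx * s * s + sy * c * c - 2.0 * txy * s * c
    t_rt = (sy - sx) * s * c + txy * (c * c - s * s)
    return s_r, s_t, t_rt


def far_field_39(p, beta):
    """Right-hand sides of the three conditions (39)."""
    return (
        0.5 * p * (1.0 + math.cos(2.0 * beta)),
        0.5 * p * (1.0 - math.cos(2.0 * beta)),
        -0.5 * p * math.sin(2.0 * beta),
    )


p = 1.0
for beta in (0.0, 0.4, 1.0, 2.5):
    # A uniform longitudinal pull: sigma_x = p, sigma_y = tau_xy = 0.
    print(beta, rotate_stress(p, 0.0, 0.0, beta), far_field_39(p, beta))
```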


and for the web for large α:

$$\sigma'_\alpha = \sigma'_\beta = \tau'_{\alpha\beta} = 0 \qquad (40)$$


Since the line α = 0 belongs to the plate as well as to the web, every point of it will receive the same displacements in the plate as in the web. Let $u_\alpha$ and $u_\beta$ be the displacements normal to the α and β curves respectively, and let u and v be simplified notations. We have then the following internal boundary condition:


$$u_\beta - u'_\beta = 0 \quad\text{or}\quad v - v' = 0 \qquad (41)$$


where

$$h^2 = \frac{2}{L^2(\cosh 2\alpha - \cos 2\beta)} \qquad (42)$$


From symmetry, the transverse displacements are zero at α = 0:

$$u = u' = 0 \qquad (43)$$


The shearing stresses in the plate and in the web along the line α = 0 must be equal but of opposite sign:

$$(\tau)_{\alpha=0} = -(\tau')_{\alpha=0} \qquad (44)$$




C. Determination of the Stresses and Displacements

It was found that a solution was most easily obtained by the use of Airy's Stress Function, which we shall call F for the plate and F' for the web. This function must satisfy the equation:

$$\nabla^4 F = 0 \qquad (45)$$

and its second derivatives give the stresses. Expressed in elliptic coordinates the stresses are:

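$$\left(\frac{2}{h^2}\right)^2 \sigma_\alpha = 2\left(\frac{2}{h^2}\right)\frac{\partial^2 F}{\partial\beta^2} + \frac{\partial F}{\partial\alpha}\,\frac{\partial (2/h^2)}{\partial\alpha} - \frac{\partial F}{\partial\beta}\,\frac{\partial (2/h^2)}{\partial\beta}$$

$$\left(\frac{2}{h^2}\right)^2 \sigma_\beta = 2\left(\frac{2}{h^2}\right)\frac{\partial^2 F}{\partial\alpha^2} + \frac{\partial F}{\partial\beta}\,\frac{\partial (2/h^2)}{\partial\beta} - \frac{\partial F}{\partial\alpha}\,\frac{\partial (2/h^2)}{\partial\alpha}$$

$$\left(\frac{2}{h^2}\right)^2 \tau_{\alpha\beta} = -2\left(\frac{2}{h^2}\right)\frac{\partial^2 F}{\partial\alpha\,\partial\beta} + \frac{\partial F}{\partial\alpha}\,\frac{\partial (2/h^2)}{\partial\beta} + \frac{\partial F}{\partial\beta}\,\frac{\partial (2/h^2)}{\partial\alpha} \qquad (46)$$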


Let  ^  be  a  harmonic  function,  then: 


and 


(47) 


are  solutions  of  (45).  We  shall  make  use  of  the  first  of  these  equations. 
Laplace’s  operator  being  invariant,  we  have: 


0 


(48) 


the  solution  of  which  gives  ^  in  the  form: 

^  cos  n0  F  —  +  cos  (n  —  1)  <3 

-f  008  (n  +  1)  d 

where  n  is  any  constant. 

Other  solutions  are: 


(49) 


$F = e^{n\alpha}\cos n\beta$ and $F = \alpha$.

On  this  basis  stress  functions  F  and  F'  are  built  up. 

A general solution of equation (45) in elliptical coordinates and the corresponding expressions for stresses and displacements have been worked out by Professor Inglis* and by Coker and Filon in their work on Photoelasticity.* These expressions involve n arbitrary constants, which must be determined from the boundary conditions of the problem in question.

* C. E. Inglis: “Stresses in a Plate Due to the Presence of Cracks and Sharp Corners,” Trans. Inst. Nav. Arch., 1913, I, pp. 219-230.

In the present problem it is found that in order to satisfy conditions (39) to (44) it is sufficient to use five coefficients, $A_{+1}$, $A_{-1}$, $B_{+1}$, $B_{-1}$ and $B_{-3}$, for the plate, and three coefficients, $A'_{+1}$, $B'_{+1}$ and $B'_{-1}$, for the web. Space does not permit giving the mathematical development, for which the reader is referred to the references given below. The final expressions for the stress function, the stresses and the displacements for the plate become respectively:

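$$\frac{1}{L^2}F = A_{+1}\left[e^{-2\alpha} + \cos 2\beta\right] - A_{-1}\left[e^{2\alpha} + \cos 2\beta\right] + \tfrac{1}{2}B_{+1}e^{-2\alpha}\cos 2\beta - B_{-1}\,\alpha - \tfrac{1}{2}B_{-3}e^{2\alpha}\cos 2\beta \qquad (50)$$

$$\sigma_\alpha(\cosh 2\alpha - \cos 2\beta)^2 = 2A_{+1}\left[\cos 4\beta - 4\cosh 2\alpha\cos 2\beta + e^{-4\alpha} + 2\right] - 2A_{-1}\left[\cos 4\beta - 4\cosh 2\alpha\cos 2\beta + e^{4\alpha} + 2\right]$$
$$\qquad + B_{+1}\left[(\cos 4\beta + 3)e^{-2\alpha} - (e^{-4\alpha} + 3)\cos 2\beta\right] - 2B_{-1}\sinh 2\alpha - B_{-3}\left[(\cos 4\beta + 3)e^{2\alpha} - (e^{4\alpha} + 3)\cos 2\beta\right] \qquad (51)$$

$$\sigma_\beta(\cosh 2\alpha - \cos 2\beta)^2 = 2A_{+1}\left[\cos 4\beta - 4e^{-2\alpha}\cos 2\beta + e^{-4\alpha} + 2\right] - 2A_{-1}\left[\cos 4\beta - 4e^{2\alpha}\cos 2\beta + e^{4\alpha} + 2\right]$$
$$\qquad - B_{+1}\left[(\cos 4\beta + 3)e^{-2\alpha} - (e^{-4\alpha} + 3)\cos 2\beta\right] + 2B_{-1}\sinh 2\alpha + B_{-3}\left[(\cos 4\beta + 3)e^{2\alpha} - (e^{4\alpha} + 3)\cos 2\beta\right] \qquad (52)$$

$$\tau_{\alpha\beta}(\cosh 2\alpha - \cos 2\beta)^2 = -4A_{+1}\cosh 2\alpha\sin 2\beta - 4A_{-1}\cosh 2\alpha\sin 2\beta + B_{+1}\left[e^{-2\alpha}\sin 4\beta - (e^{-4\alpha} + 3)\sin 2\beta\right]$$
$$\qquad - 2B_{-1}\sin 2\beta + B_{-3}\left[e^{2\alpha}\sin 4\beta - (e^{4\alpha} + 3)\sin 2\beta\right] \qquad (53)$$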

n/h  u.  «  i4+il(l  4-  r)  cos  2^  4-  (1  -  r)e-*«] 

4-  i4_,((l  4-  r)  cos  2^  4-  (1  -  r)e>-] 

4"  B.fie“*“  cos  20  4"  B_i  4"  B_je**  sin  20  . (54) 


* E. G. Coker and L. N. G. Filon: “A Treatise on Photoelasticity,” Cambr. Univ. Press, 1931.




ti/h  Uf  »  A+i(l  —  r)  sin  2$  —  i4_i(l  —  r)  sin  2/3 
+  B+if**  sin  2/3  —  sin  2/3 
where 

1  ^  L*  (m  +  1)  ' 

K  mE 

and  ' 


(55) 


(56) 


In order to obtain the solution for the web the coefficients in equations (50)-(55) shall be replaced by A' and B'. A' and B' are so chosen that condition (40) is satisfied for large α. The stress function for the web is:

$$\frac{1}{L^2}F' = A'_{+1}\left[e^{-2\alpha} + \cos 2\beta\right] + \tfrac{1}{2}B'_{+1}e^{-2\alpha}\cos 2\beta - B'_{-1}\,\alpha \qquad (50')$$

The values of the coefficients, with E = 30.2 × 10⁶ lbs. per sq. in. and m = 3.64, are found to be as follows:

$A'_{+1}$ = +0.04585 p
$B'_{+1}$ = −0.14380 p
$B'_{-1}$ = +0.05220 p
$A_{+1}$ = +0.01665 p
$A_{-1}$ = −0.06250 p
$B_{+1}$ = +0.01880 p
$B_{-1}$ = −0.05222 p
$B_{-3}$ = +0.12500 p.


The stresses and the displacements can now be obtained by substituting these values in equations (51) to (55).

We are particularly interested in the stresses and the displacements in the weld. For α = 0, the shearing stresses are, from equation (53):

$$(\tau_{\alpha\beta})_{\alpha=0} = -\frac{2}{7}\,p\cot\beta = -\frac{2}{7}\,p\,\frac{x}{\sqrt{L^2 - x^2}} \qquad (57)$$

and the displacements are, from (55):

$$(u_\beta)_{\alpha=0} = -0.4760\,\frac{pL}{E}\cos\beta = -0.4760\,\frac{p}{E}\,x \qquad (58)$$
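The step from (53) to (57) may be followed numerically. The sketch below evaluates the stress expressions (51) to (53) for a given set of coefficients (in units of p) and checks three things: that the plate stresses tend to the values (39) for large α, that the plate shear on α = 0 is close to −(2/7) p cot β, and that the web shear there is equal and opposite, as (44) requires. The assignment of the tabulated values to $A_{\pm 1}$, $B_{\pm 1}$ and $B_{-3}$ is an assumption made from the legible parts of the text, and the function and dictionary names are hypothetical.

```python
import math

# Coefficients in units of p; the labelling of the keys is an assumption.
PLATE = {"A1": 0.01665, "Am1": -0.06250, "B1": 0.01880, "Bm1": -0.05222, "Bm3": 0.12500}
WEB = {"A1": 0.04585, "Am1": 0.0, "B1": -0.14380, "Bm1": 0.05220, "Bm3": 0.0}


def stresses(alpha, beta, c):
    """Return (sigma_alpha, sigma_beta, tau_alpha_beta) divided by p."""
    ch2, sh2 = math.cosh(2 * alpha), math.sinh(2 * alpha)
    c2, s2 = math.cos(2 * beta), math.sin(2 * beta)
    c4, s4 = math.cos(4 * beta), math.sin(4 * beta)
    em2, ep2 = math.exp(-2 * alpha), math.exp(2 * alpha)
    em4, ep4 = math.exp(-4 * alpha), math.exp(4 * alpha)
    d = (ch2 - c2) ** 2                      # common factor on the left sides
    b_plus = (c4 + 3) * em2 - (em4 + 3) * c2
    b_minus = (c4 + 3) * ep2 - (ep4 + 3) * c2
    sa = (2 * c["A1"] * (c4 - 4 * ch2 * c2 + em4 + 2)
          - 2 * c["Am1"] * (c4 - 4 * ch2 * c2 + ep4 + 2)
          + c["B1"] * b_plus - 2 * c["Bm1"] * sh2 - c["Bm3"] * b_minus)
    sb = (2 * c["A1"] * (c4 - 4 * em2 * c2 + em4 + 2)
          - 2 * c["Am1"] * (c4 - 4 * ep2 * c2 + ep4 + 2)
          - c["B1"] * b_plus + 2 * c["Bm1"] * sh2 + c["Bm3"] * b_minus)
    t = (-4 * (c["A1"] + c["Am1"]) * ch2 * s2
         + c["B1"] * (em2 * s4 - (em4 + 3) * s2)
         - 2 * c["Bm1"] * s2
         + c["Bm3"] * (ep2 * s4 - (ep4 + 3) * s2))
    return sa / d, sb / d, t / d


beta = math.pi / 4
tau_plate = stresses(0.0, beta, PLATE)[2]
tau_web = stresses(0.0, beta, WEB)[2]
print(tau_plate, -2.0 / 7.0 / math.tan(beta), tau_web)
```

The agreement with 2/7 is to within about one per cent, which is what the rounding of the coefficients to four or five figures would lead one to expect. Very close to the ends of the weld (β near 0 or π) the rounding errors are magnified by the vanishing denominator, so the comparison is best made at intermediate values of β.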




Now, the shearing force exerted by the plate at any point of the line α = 0 is resisted by the shearing force in the web at the same point, and this force must be transmitted through the weld. Expressing the shearing force in the weld as the force acting per unit length, that is, on the area γ, the corresponding area of the plate is equal to twice the thickness. We thus obtain the following simple formula for the shearing stresses in the weld:

$$q_s = \frac{4}{7}\,\frac{pt}{\gamma}\,\frac{x}{\sqrt{L^2 - x^2}} \qquad (59)$$

which is zero at the middle of the weld and infinite at the ends. The displacements, however, have finite values at the ends of the weld.

If we assume the weld to be very stiff, i.e. the displacements $(u_\beta)_{\alpha=0} = 0$ (see equation (41)), we shall have 8/7, instead of 4/7, for the coefficient in equation (59).

The  results  obtained  so  far  are  very  interesting  and  especially  the 
formula  for  the  shearing  stress  in  the  weld  is  of  importance,  but,  being 
based  on  the  assumption  of  large  or  infinite  extent  of  plate  and  web, 
it  remains  to  determine  the  influence  of  the  finite  boundaries  on  the 
state  of  stress. 

First  the  stresses  which  exist  along  the  finite  boundaries  when  these 
are  imagined  to  be  traced  on  the  infinite  plate  and  web,  were  calculated 
from  the  formulas  given  above.  Next,  compensating  stress  functions 
were  constructed,  which  should  annul  or  correct  these  stresses  without 
impairing  the  agreement  of  the  resulting  total  stress  function  with  the 
biharmonic  equation,  and  without  disturbing  the  equilibrium  at  the 
internal  boundary. 

It  proved,  however,  impossible  or  at  least  impracticable  to  obtain 
an  accurate  solution  when  the  boundaries  are  given  in  terms  of  x  and  y, 
but  it  was  shown  that  in  cases  where  the  plate  and  web  are  of  great 
width,  the  corrections  to  the  stresses  are  very  small.  For  narrow 
structures,  such  as  the  test  specimen  here  used,  the  method  failed. 

Hence  another  solution  was  worked  out  by  the  use  of  complex  stress 
functions,  whereby  the  finite  boundaries  could  be  more  easily  taken 
into  account. 


D. Modification of the Formula for the Shearing Stress

The principal outcome of the previous investigation is the formula (59) for the shearing stress $q_s$. This formula will be somewhat modified when the boundaries are finite instead of being at an infinite distance as hitherto assumed, but it is assumed that the form of the equation for $q_s$ will remain the same so that only a modification of the numerical coefficient 4/7 is required. This modification was made on the basis of experimental results as follows:

By multiplying the expression in (59) by γ we obtain the shearing force acting per unit length of the weld:

$$Q_s = \frac{4}{7}\,pt\,\frac{x}{\sqrt{L^2 - x^2}} \qquad (60)$$

and we have thus a line of forces $Q_s$ along the weld acting both on the web and on the plate.

The total load taken up by the weld is therefore:

$$Q = \int_0^L Q_s\,dx = \frac{4}{7}\,ptL \qquad (61)$$

Let W be the total applied load and b the half breadth of the plate, and hence W = 2btp, then:

$$Q = \frac{2LW}{7b} \qquad (62)$$

Since L/b = 4 in the specimen, eight-sevenths of the total load should be transmitted to the web when web and plate are of infinite extent. But the tests showed that actually with the given finite boundaries the load was equally distributed between the web and the plate. Consequently in the formula for $q_s$ the factor 4/7 should be changed to 1/4, whereby Q becomes equal to W/2. In case of other structures a similar correction to (59) should be determined experimentally whenever practicable.
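The integration leading to (61) and (62) can be reproduced numerically. The sketch below is only an illustration: the numerical values of L, b, t and p are arbitrary (only the ratio L/b = 4 is taken from the text), the function names are hypothetical, and the substitution x = L sin θ is used to avoid the infinite ordinate at the end of the weld.

```python
import math


def weld_force_per_length(x, L, p, t, coeff=4.0 / 7.0):
    """Shearing force per unit length of weld, coeff * p * t * x / sqrt(L^2 - x^2)."""
    return coeff * p * t * x / math.sqrt(L * L - x * x)


def total_weld_load(L, p, t, coeff=4.0 / 7.0, n=20000):
    """Integrate the line load from x = 0 to x = L by the midpoint rule.

    With x = L sin(theta), dx = L cos(theta) d(theta), and the integrand
    stays finite over 0 <= theta < pi/2.
    """
    h = 0.5 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        x = L * math.sin(theta)
        total += weld_force_per_length(x, L, p, t, coeff) * L * math.cos(theta) * h
    return total


# Arbitrary illustrative dimensions with L / b = 4.
L, b, t, p = 4.0, 1.0, 0.25, 1000.0
W = 2.0 * b * t * p

ratio_infinite = total_weld_load(L, p, t, coeff=4.0 / 7.0) / W   # about 8/7
ratio_corrected = total_weld_load(L, p, t, coeff=1.0 / 4.0) / W  # about 1/2
print(ratio_infinite, ratio_corrected)
```

The first ratio exceeds unity, which is the impossibility that led to the empirical change of the coefficient from 4/7 to 1/4.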

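In order to fix ideas we shall in the following reckon:

$$q_s = \frac{1}{4}\,\frac{pt}{\gamma}\,\frac{x}{\sqrt{L^2 - x^2}} \qquad (63)$$

and

$$Q_s = \frac{1}{4}\,pt\,\frac{x}{\sqrt{L^2 - x^2}} \qquad (60')$$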


E.  Solution  by  the  Method  of  Complex  Integration.  Infinite  Extent  of 
Plate  and  Web 

In this solution we assume the shearing stresses $q_s$ to be given by equation (63) and regard the elemental shearing forces $Q_s$ given by (60') as external forces both with respect to the plate and the web. Thus we discount the influence at the internal boundary of the stress corrections for the external boundaries. The internal boundary can be disregarded as such and it becomes easier to correct for the residuary stresses at the external boundaries. In the solution for the infinite plate and web the plate will be subject to the load W and the $Q_s$-forces along the weld; the web will be subject only to the $Q_s$-forces. We can entirely disregard the web when we deal with the plate and vice versa.

We shall first briefly review how the displacements and hence the stresses can be determined by complex integration.*

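We start from the displacement equations in generalized coordinates:

$$\nabla^2 u + \frac{m+1}{m-1}\,\frac{\partial e}{\partial x} = 0, \qquad \nabla^2 v + \frac{m+1}{m-1}\,\frac{\partial e}{\partial y} = 0 \qquad (64)$$

where

$$e = \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y}$$

If we now write

$$\Theta = \frac{2m}{m-1}\left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y}\right) \quad\text{and}\quad \Omega = \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y} \qquad (65)$$

then (64) goes over into the Cauchy-Riemann differential equations, so that:

$$f(z) = \Theta + i\,\Omega = \frac{2m}{m-1}\left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y}\right) + i\left[\frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}\right] \qquad (66)$$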


* Love: Th. of Elast., 3rd ed., pp. 203 et seq.

A. and L. Föppl: Drang und Zwang, 1920, I, p. 268.

L. Föppl: “Konforme Abbildung ebener Spannungszustände,” Zeitschr. für Angew. Math. und Mech., 1931, pp. 81-92.


Let the integral of (66) be:

$$\int f(z)\,dz = F(z) = \Phi + i\,\Psi \qquad (67)$$

then it can be shown that the solution of (64) is:

.  w  +  1  «l>  w  +  1  d(7 

- - 4^>*  +  2-^-n 

. (68) 

4m  4m  4m  dp  ^ 

where U is a potential function which is to be so determined that the displacements u and v shall have unique values everywhere. From (68) the stresses are determined by standard formulas:


^  av*  .  d*f'\ 

<  V  8jr*  j 

’■■--iU  +  OT  +  Pr 

,4  \  dx  dxdy 


which  can  be  rewritten  in  the  following  form; 

4.  ^  j.  ^ 

a^*  4  \dy  dy  dy^ 


d'F  _E  /d>t>  d  (u 
a^*  4  \dy  di 

d'F_E/d(y^) 
dx*  “  4  \  dx 

_  «  ^Y  -  ^  - 

dxdy  4  \  dx 


a  d*f  ^ 
dx  dxdy> 


Bearing  in  mind  that 


^  ^  -  I 

dx^  dy  “  ^ 

. (70) 

d«fr  d+ 
dy  dx  j 

we find the Airy stress function, which we denote by $F_{II}$, by integrating (69') twice:

$$F_{II} = \frac{E}{4}\,(y\Psi + U) \qquad (71)$$

Another equivalent solution of (64) could be obtained by which $F_{II}$ is expressed in x and Φ instead of y and Ψ.

In order to determine the various potential functions here introduced, we first find the form of the complex function f(z) on the basis of the known shearing force $Q_s$ acting at any point in the weld:

Let x = λL, then the complex stress function for a pair of forces $dQ_s = Q_s\,dx$, acting upon elements dx of the weld line at distances ±λL from the origin, is:*


where 


df{z) 


2\A 

L  (o*  -  X») 


A 


-  dQ,Ld\ 


2wOt 


(72) 


and G is the modulus of rigidity. $Q_s$ is given by (60').

The complete stress function is found by integration with respect to λ from 0 to 1:


where 


/(«)  -  ^  +  jV' 


2 


.  -  1 
LVr*  -  L*  J 


D-  J 


iKi 


.  p  (m  +  1) 
’  xL  m  ’ 


(73) 


Integration  of  (73)  gives: 

♦  4-  ^  [Vz*  -  L*  -  z]  . (74) 

The stress function f(z) becomes infinite only at the ends of the weld, where also the shearing stresses are infinite.

The potential functions Φ and Ψ are best expressed in elliptical coordinates, as employed in the previous solutions.

From (34):

$$z = x + iy = L\cosh(\alpha + i\beta) = L\cosh\zeta \qquad (75)$$

so  that  (73)  becomes: 

,  • .  xDfooshf  ,1 

+  "'J 


* A. and L. Föppl: Drang und Zwang, 1920, I, p. 273 et seq.




which after some transformation gives:

f  +  H-  «PrMnh2a-i«iiiM  _  ,] 
2  L  cosh  2a  —  cos  2(i  J 

Equating  real  and  imaginary  parts  we  get: 


sinh  2a 


r_D _ 

2  ooeh  2a  —  cos  2d 

sin  2d 


2 


2  cosh  2a  —  cos  2d 
The values of Φ and Ψ are found in the same way:

tD  ,  .  ,  a  rD 

♦  »  —  L  smh  o  cos  d  —  -TT  ^ 

2  2 

♦  -  —  L  cosh  a  sm  d - 


(730 


(76) 


(77) 


It remains to determine the potential U which was introduced in (68). It is convenient for this purpose to divide the functions Φ, Ψ and U into two parts, denoted by the suffixes (1) and (2), where the suffix (1) denotes the second term on the right hand side of equations (76) and (77), and (2) the first term. It is seen that for system (1) the displacements in (68) are uniquely determined without the use of the last term, and hence:

$$\frac{\partial U_1}{\partial x} = \frac{\partial U_1}{\partial y} = 0 \quad\text{or}\quad U_1 = \text{constant} \qquad (78)$$

Next  the  displacements  ut  and  Vt  are  expressed  in  accordance  with 
(68),  and  making  use  of  the  condition  that  (ry)«.  •  >  0  as  well  as  of 
the  equations  (70)  we  find  finally: 

d*f  0  ^  _  m  —  1  d«I»i  _  _  ~  1 

dx*  »»  +  ldx  m  -h  1  ^ 

d*l’t  ^  m  —  1  ^  m  —  1 

dy*  »n-|-ldy  m  + 

d*f'f  ^  ~  1  ^  ~  ^ 

didy  m  +  1  dy  m  +  1  * 


(79) 


The final expressions for the stresses are now obtained by substituting the values of Ψ and U for the systems (1) and (2) in equations (69), adding them and carrying out the differentiations.

They are given by Dr. Yeh as follows:


tDE 

16 


L*h* 


pirn  +  1  . 

Lm  +  1  *' 


sinh  2a  -I-  L*h*  cosh  a  sin  ^(1 


!•  —  cosh  2a  008  2d  —  4  sinh*  a  oos*  d)  ~ 

^  L*)i*  sinh  2a  +  IJh*  cosh  a 

"  16  Lw  4-  1 

d  (1  —  cosh  2a  008  2d  +  4  sinh*  a  oos*  d)  1/J 
(T^)r„  -  “  L'h*\ - sin  2d  -I-  L*A«sinh  a 

"  16  L  +  I 

d  (1  ~  ooMh  2a  008  2d  +  4  oosh*  a  sin*  d)  1/J 

With E = 30.2 × 10⁶ lbs. per sq. in. and m = 3.64 we obtain:
irED  '\%  -  .o:i9« 


oos 


(80) 


This enables us to calculate the stresses from (80) and hence in general the state of stress for an infinite plate under the action of the shearing forces in the weld given by (60'). Equations (80) hold good also for an infinite web with the sign of D reversed.

Table 1 gives the result of a computation of the stresses from (80), as they exist along a network of straight lines, which we may imagine to be drawn on the infinite plate. The lines y/L = 1/4 and x/L = 1 and 2 are the actual boundaries in the test-piece, and it is the adjustment of the stresses along these boundaries which forms the subject of the following two sections.




TABLE 1: Stress Computations by Equations (80), $F_{II}$

y/L    stress    x/L = 0    1/2        3/4        15/16 (?)   1          17/16 (?)   2

1/4    σ_x       -0.1047    -0.0768    -0.0166                +0.0640                +0.0860
       σ_y       -0.0333    -0.0442    -0.0664                -0.0266                -0.0501
       τ_xy       0         +0.0682    +0.0764                +0.0640                +0.0060

3/16   σ_x       -0.1228    -0.0?61    -0.0336    +0.0421     +0.0823    +0.1120     +0.0832
       σ_y       -0.0268    -0.0361    -0.0666    -0.0430     -0.0242    -0.0170     -0.0682
       τ_xy       0         +0.0675    +0.0006    +0.0806     +0.0836    +0.0688     +0.0078

1/16   σ_x       -0.1626    -0.1517    -0.1184    +0.0320     +0.2180    +0.2031     +0.0878
       σ_y       -0.0088    -0.0134    -0.0267    -0.0778     -0.0178    -0.0680     -0.0507
       τ_xy       0         +0.0803    +0.1480    +0.2237     +0.1734    +0.1264     +0.0026

0      σ_x       -0.1822    -0.1822    -0.1822    -0.1822     +∞         +0.4260     +0.0876
       σ_y        0          0          0          0          -∞         -0.1348     -0.0108
       τ_xy       0         +0.0825    +0.1620    +0.3604     +∞          0           0

The stresses are found by multiplying the factors given in this Table by p.


F.  Correction  for  Finite  Boundaries  of  the  Plate 

From Table 1 it is seen that the stresses at the boundaries y = L/4 and x = 2L do not satisfy the boundary conditions, and it is now attempted to construct compensating stress functions, which, when superposed upon the function $F_{II}$, shall at least approximately produce an expression which fulfils the actual boundary conditions, while at the same time satisfying the biharmonic equation rigorously.

We have now the advantage that the line weld can be regarded as acted upon by the known shearing forces as given by (60'), since we have already modified that expression so as to allow for the presence of the external boundaries.

From Table 1 it is seen that the stresses across the boundary x = 2L are fairly uniform and that hence the required condition of uniform stress p can be easily satisfied by adding a uniform longitudinal stress throughout the whole plate, the magnitude of which will be determined later. This corresponds to a “clamped” plate condition at the end boundaries, but the residual stresses at the side boundaries, y = L/4, must be compensated for or annulled, since actually these boundaries are free of normal and shearing stresses, $\sigma_y$ and $\tau_{xy}$. We should thus require two compensating stress functions, one for $\sigma_y$ and one for $\tau_{xy}$, but it was found that the residual $\sigma_y$-stresses were so small that they could be neglected.




The Compensating Stress Function $X_1$ for $\tau_{xy}$: The boundary
conditions for $x = 2L$ require that $X_1$ and $\partial X_1/\partial x$ shall be
equal to zero.

In order to compensate for $\tau_{xy}$ at $y = L/4$ we must make
$(\tau_{xy})_{X_1}$ take the same value as in Table 1 but of opposite sign,
while $(\sigma_y)_{X_1}$ is equal to zero. In other words, this function must
give no stress normal to the longitudinal edges.

The stress function used in this case is one given by Professor H. M.
Westergaard:*

$$X_1 = \sum_n \frac{B_n \tanh K_n \cos w_n x}{n^2\,(\sinh 2K_n + 2K_n)}\left(K_n \tanh K_n \cosh w_n y - w_n y \sinh w_n y\right) \qquad (81)$$

where

$$w_n = \frac{n\pi}{4L}, \qquad K_n = \frac{n\pi}{16}.$$

The series was limited to three terms, $n = 1$, $3$, and $5$, and $B_1$, $B_3$, $B_5$
were evaluated by approximately satisfying the above conditions at
$x = L/2$, $x = L$ and $x = 2L$. Actually $\partial X_1/\partial x$ is not zero, but the
effect of this discrepancy is negligible.

The Compensating Stress Function $X_2$ for the Applied Load: The
applied load $W$ was assumed to be uniformly distributed as a tension $p$
across the end sections of the plate. The corresponding stress
function is:

$$X_2 = +\frac{p}{2}\,y^2$$

From this tension must now be subtracted the residual tension $\sigma_x$,
referred to above, remaining from $F_{II}$ for the infinite plate at $x = 2L$.
Taking the average of this tension with equal distribution of load
between plate and web, the correction is found to be $-0.0756\,p$, so
that $X_2$ becomes:

$$X_2 = 0.4622\,p y^2 \qquad (82)$$

This gives:

$$(\sigma_x)_{X_2} = 0.9244\,p, \qquad (\sigma_y)_{X_2} = (\tau_{xy})_{X_2} = 0 \qquad (83)$$

* H. M. Westergaard: "Computation of Stresses in a Bridge Slab Due to
Wheel Loads." Public Roads, Vol. 11, No. 1, 1930.


The solution for the plate is thus complete. At any given point the
stresses can be determined as second derivatives of:

$$F = F_{II} + X_1 + X_2 \qquad (84)$$


In Fig. 7 the corresponding state of stress is graphically represented.

[Fig. 7]

G.  Correction  for  Finite  Boundaries  of  the  Web 

The web is only one-half as long as the plate, but has the same width
and thickness, and therefore the same sectional area. The end boundaries
are given by $x = \pm L$.

The  web  is  loaded  only  along  the  line  of  weld,  and  the  shearing  forces 
acting  along  this  line  are  of  the  same  magnitude  but  of  opposite  sign 
to  those  acting  on  the  plate. 

The stress function for an infinite web is the same as that for an
infinite plate, loaded along the weld line, except that the sign is reversed.
Also the compensating functions along the longitudinal edges will be
the same as for the plate, but of opposite sign. Hence, using the prime
on symbols pertaining to the web:

+  . (85) 

The end boundaries $x = \pm L$ are actually free from stress normal and
tangential to them, except at the end of the weld, $y = 0$, where the stress
theoretically is infinite. The first line in Table 2 gives the residual
stresses along this edge as obtained for the infinite web from $F_{II}$.

In  compensating  for  these  stresses  Dr.  Yeh  devised  an  ingenious 
method  which  here,  however,  can  be  described  only  in  outline. 


TABLE 2: The Boundary Stresses in the Web at $x = L$

y/b                      0            1/4        1/2        3/4        1

σ_x/p   Residual        -∞           -0.2580    -0.1756    -0.1280    -0.1134
σ_x/p   Equ. 86         [+0.3634]    +0.2540    +0.1750    +0.1290    +0.1134

First, the following approximate expression was constructed to represent
the residual normal tractions at $x = L$ given in the first line
of Table 2:

$$\sigma_x = -p\left[0.1134 + 0.2500\left(1 - \frac{y}{b}\right)^2\right] + \text{a concentrated force } (-0.0636\,W) \qquad (86)$$

where $b = L/4$ is one-half the width of the web.
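As a check on the bracket term of (86), the sketch below evaluates it at the five stations of Table 2. The $(1 - y/b)^2$ variation is an assumption made here: that factor is not legible in the source, and the form is inferred only from the tabulated values, which it reproduces to within rounding.

```python
from fractions import Fraction

def bracket_term(y_over_b):
    """Magnitude of the bracket term of (86) in units of p.

    The (1 - y/b)**2 variation is an assumed form inferred from Table 2.
    """
    return 0.1134 + 0.2500 * (1.0 - y_over_b) ** 2

stations = [Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(3, 4), Fraction(1)]
tabulated = [0.3634, 0.2540, 0.1750, 0.1290, 0.1134]  # second line of Table 2

for s, tab in zip(stations, tabulated):
    val = bracket_term(float(s))
    print(f"y/b = {str(s):>3}: formula {val:.4f}, Table 2 {tab:.4f}")
```

The corrective stresses in the second line of Table 2 are equal and opposite to this bracket term, which is why the tabulated signs are positive.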

The first part of this expression, the bracket term, represents fairly
the stresses except at the point $x = L$, $y = 0$, where $\sigma_x$ should be infinite.
This term is represented in Fig. 8 by a curve, the area of which falls
short of the total pull of the residual stresses by an amount $0.0636\,W$.
This amount is represented by a concentrated force supposed to be
acting at the end of the weld line.

Three  stress  functions  are  employed  to  compensate  for  this  peculiar 
stress  distribution. 

The first term inside brackets in (86) represents a uniform compression,
which is taken care of by the function

$$X_2' = +0.0567\,p y^2 \qquad (87)$$

giving the stresses:

$$(\sigma_x)_{X_2'} = +0.1134\,p, \qquad (\sigma_y)_{X_2'} = (\tau_{xy})_{X_2'} = 0 \qquad (88)$$

The last term inside brackets is provided for by a stress function
$X_3'$ which consists of one term yielding the required stresses at the end
boundaries, and a series of terms containing arbitrary constants, so
adjusted as to give no stresses at any of the boundaries, but leaving one
arbitrary constant. The latter constant is now so determined as to
make the potential energy a minimum,* but no attempt is made to
satisfy the elastic equation rigorously. In order to simplify, Poisson's
ratio is assumed to be zero ($m = \infty$).


[Fig. 8: Residual Stresses at the Ends of the Web]


The concentrated force is developed into a Fourier's series and the
corresponding stress function $X_4'$ is found by a method given by
Dr. F. Bleich.**

The tangential forces at the end of the web were also compensated
for in the thesis, but it was found that this correction had only a
negligible effect on the stresses in the web.

The corrective stresses determined from the sum of $X_2' + X_3' + X_4'$
are given in the second line of Table 2.

The stresses in the web can now be found by differentiation of:

$$F' = F_{II}' + X_2' + X_3' + X_4' \qquad (89)$$

and are given by curves in Fig. 7.

* Föppl: Drang und Zwang, I, 1920, pp. 323 et seq.
S. Timoshenko: "An Approximate Solution of Two-dimensional Problems in
Elasticity." Phil. Mag. Ser. 6, Vol. 47, 1924, p. 1095.

** F. Bleich: "Der gerade Stab mit Rechteck-Querschnitt als ebenes Problem,"
Der Bauingenieur, 1921, p. 255.




H.  The  Displacement  Coefficient 

As  defined  in  Chapter  I  the  displacement  coefficient  is  equal  to  the 
average  linear  displacement  of  a  transverse  section  of  the  web  relative 
to  that  of  the  same  section  of  the  plate  divided  by  the  shearing  stress 
in  the  weld  at  that  section. 


From the computations given or referred to above, the stresses and
hence the strains at any point are known both in the plate and in the web,
so that the displacements can be calculated by a simple process of
integration. The shearing stress being known from (63), we can then
find $\mu$ for any point $x$ in the weld, and it is now possible to examine
how far the assumption adopted in Chapter I of the constancy of $\mu$ in
a given weld is confirmed by the present analysis. The procedure is
as follows.

We start from the fundamental equation for generalised plane stress:

$$2G\,\frac{\partial u}{\partial x} = \sigma_x - \frac{1}{m+1}\,(\sigma_x + \sigma_y) \qquad (91)$$

where $u$ is the longitudinal displacement at any point in the plate.
Neglecting $\sigma_y$, which is small relative to $\sigma_x$ except in the closest vicinity
to the ends of the weld, we find the average value of $u$ from:

$$u_{\text{avg.}} = \frac{1}{E}\int_0^x (\sigma_x)_{\text{avg.}}\,dx \qquad (92)$$

Similarly for the web, and we find thus the average displacement of the
web relative to that of the plate:

$$U_s = \frac{1}{E}\int_0^x \left[(\sigma_x)_{\text{avg.}} - (\sigma_x')_{\text{avg.}}\right]dx = \frac{1}{E}\int_0^x \left[p_a - p_b\right]dx \qquad (93)$$

re-introducing the symbols $p_a$ for the average stress in the plate and
$p_b$ for the bar as used in Chapter I.

From the conditions of equilibrium we have from Chapter I:

$$(1)\qquad p_a = p - \frac{\gamma}{A}\int_x^L q_s\,dx$$

$$(2)\qquad p_b = \frac{\gamma}{a}\int_x^L q_s\,dx$$




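We have also (63):

$$q_s = \frac{1}{4}\,\frac{pt}{\gamma}\,\frac{x}{\sqrt{L^2 - x^2}} \qquad (94)$$

From the last four equations we find, since $a = A = 2bt = Lt/2$:

$$U_s = \frac{pL}{E}\left\{\frac{x}{L} - \frac{1}{2}\left[\frac{x}{L}\sqrt{1 - \left(\frac{x}{L}\right)^2} + \sin^{-1}\left(\frac{x}{L}\right)\right]\right\} \qquad (95)$$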

[Fig. 9: Average Stresses and Displacement Coefficient $\mu$]


According to this analysis $\mu = U_s/q_s$ is not constant for a given joint,
as is evident from the fact that at the end of the weld, where $x = L$,
$U_s$ is finite while $q_s$ is infinite, so that $\mu$ is zero at that point.

The values of $\mu$, calculated for $W = 30{,}000$ lbs., $A = a = 1.5$ sq. in.,
$p = 20{,}000$ lbs. per sq. in., $E = 30.2 \times 10^{6}$ lbs. per sq. in.,
$L = 12$ in., are plotted in a curve in Fig. 9. This curve rises from
zero at the middle to a maximum of $0.327 \times 10^{-6}$ near the end of the
weld, after which it drops very abruptly towards the end. The average
value of $\mu$ is equal to $0.158 \times 10^{-6}$. The dotted curve for $\mu$ is obtained
experimentally as described in Chapter V. It is reproduced in Fig. 12.
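The curve for $\mu$ can be reproduced numerically. The sketch below is an illustration, not the computation of the thesis. It assumes $q_s = \frac{pt}{4\gamma}\,x/\sqrt{L^2 - x^2}$ for the shearing stress and $U_s = \frac{pL}{E}\{\xi - \frac{1}{2}[\xi\sqrt{1 - \xi^2} + \sin^{-1}\xi]\}$ with $\xi = x/L$ for the relative displacement, as these equations are read here from a damaged print. The plate thickness $t = 1/4$ in. and $\gamma = 0.707$ sq. in. per in. are the specimen values given in Chapter V.

```python
import math

# Test-specimen data quoted in the text.
p = 20_000.0     # lbs. per sq. in., applied tension
E = 30.2e6       # lbs. per sq. in., modulus of elasticity
L = 12.0         # in., half-length of the weld
t = 0.25         # in., plate thickness (from A = 6 in. x 1/4 in.)
gamma = 0.707    # sq. in. per in., throat area of the welds per unit length

def q_s(x):
    """Shearing stress in the weld (assumed reading of the stress formula)."""
    return p * t / (4.0 * gamma) * x / math.sqrt(L * L - x * x)

def U_s(x):
    """Relative displacement of plate and web (assumed reading)."""
    xi = x / L
    return p * L / E * (xi - 0.5 * (xi * math.sqrt(1.0 - xi * xi) + math.asin(xi)))

def mu(x):
    """Displacement coefficient mu = U_s / q_s."""
    return U_s(x) / q_s(x)

# Scan the open interval 0 < x < L; mu tends to zero at both ends.
xs = [L * i / 2000.0 for i in range(1, 2000)]
mu_max, x_at_max = max((mu(x), x) for x in xs)

print(f"max mu = {mu_max:.3e} at x = {x_at_max:.2f} in.")
```

Under these assumptions the maximum falls near $x \approx 10$ in. and comes out within about one percent of the quoted $0.327 \times 10^{-6}$, which also supports reading the lost exponent as $-6$.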




Two other sets of curves are shown in Fig. 9, one for $p_a$ and one
for $p_b$, each comprising three curves: one calculated from (1) and (2)
using Dr. Yeh's value for $q_s$ given in (94); another from (16) and (17)
using the average value $\mu = 0.158 \times 10^{-6}$; and a third curve based
directly on experimental results as described in the following chapter.
A close correspondence is found between the three curves in each set.

I. Tapering the Web

When reenforcing girders, unbracketed at the ends, are fitted on
plated structures it is customary in practice to taper the ends, partly
because it is believed that the outstanding parts at the ends of such
girders are inactive, partly because it is thought that thereby the end
connections, whether rivets or longitudinal welds, will be partly relieved.
The web in the specimen here under consideration may be regarded as
a reenforcing girder and it is of interest to examine whether tapering of
this web at the ends does really produce a better distribution of the
stresses. This question has not to the author's knowledge been studied
before in a rigorous manner, but Dr. Yeh in his thesis has investigated
it theoretically by assuming the web to be bounded by an ellipse of
nearly the same major axis and exactly the same minor axis as the
length and the breadth respectively of the rectangular web actually
fitted in the specimen. This problem is relatively simple, since the
solutions given above are based on elliptical coordinates, and we need
only choose the appropriate value of the bounding coordinate $\alpha$.

The external boundary of the web is then defined by:

$$\frac{x^2}{L^2\cosh^2\alpha_1} + \frac{y^2}{L^2\sinh^2\alpha_1} = 1 \qquad (96)$$

By taking $\alpha_1 = 0.2475$ we have the semi-major axis $1.03L$, or practically
12", and the semi-minor axis $L/4 = 3$".
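The choice of the bounding coordinate is easy to verify. The short sketch below evaluates the semi-axes $L\cosh\alpha_1$ and $L\sinh\alpha_1$ of the bounding ellipse for the value quoted in the text, with $L = 12$ in. as in the test specimen. The variable names are illustrative.

```python
import math

L = 12.0          # in., half-length of the weld in the test specimen
alpha_1 = 0.2475  # bounding elliptic coordinate chosen in the text

semi_major = L * math.cosh(alpha_1)   # along the weld
semi_minor = L * math.sinh(alpha_1)   # across the web

print(f"semi-major axis = {semi_major / L:.4f} L = {semi_major:.3f} in.")
print(f"semi-minor axis = {semi_minor / L:.4f} L = {semi_minor:.3f} in.")
```

The minor axis reproduces $L/4$ almost exactly, while the major axis exceeds $L$ by about three percent, which is the sense in which the ellipse has "nearly the same" major axis as the rectangular web.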

Since there are no normal or tangential stresses at the boundary $\alpha_1$
we must have:

$$\sigma_\alpha = 0 \quad \text{and} \quad \tau_{\alpha\beta} = 0 \qquad (97)$$

The plate is assumed to be of infinite or large extent in all directions
and subject to a longitudinal pull $p$ at the boundary, which, as in
Chapter IV, A, is conceived as a circle of very large diameter. The
external boundary conditions of the plate are therefore the same as in
Chapter IV, B, defined by equations (39). The internal boundary is
again formed by the line weld $\alpha_0 = 0$, and the relations between the
displacements and shearing stresses are the same as given by (41),
(43) and (44).

The solution for the displacements and the stresses follows the same
line as in Chapter IV, C, and yields the longitudinal displacement in
the weld:

$$u = 0.6250\,px/E \qquad (98)$$

and the shearing stress in the weld:

$$q_s = 0.3118\,\frac{pt}{\gamma}\,\frac{x}{\sqrt{L^2 - x^2}} \qquad (99)$$

These equations are of the same form as (58) and (59) for the rectangular
web, but differ in the value of the coefficients. The shearing
stresses are reduced by nearly 50 percent, while the displacements of
the weld are increased by about 30 percent.

Although the actual boundary conditions of the plate are not those
here assumed, we may conclude that by tapering of the web, the weld
becomes more yielding, but the shearing stress at the ends of the weld
remains infinite according to this theory.

It is of interest to note that the effect of the taper here assumed is
the same as if the modulus of elasticity of plate and web were lowered
to about 23,000,000 lbs. per sq. in.
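The two percentages and the equivalent modulus can be related by simple arithmetic. The sketch below is a hedged illustration. It assumes that the displacement varies inversely with the modulus, so that a 30 percent increase in displacement corresponds to dividing $E$ by 1.30, and it compares the tapered-web coefficient 0.3118 with the factor 4/7 given for a large rectangular web in the summary that follows. Neither step is spelled out in the source.

```python
E = 30.2e6                      # lbs. per sq. in., modulus quoted for the specimen
displacement_increase = 0.30    # "about 30 percent"

# Assumption: displacement is proportional to 1/E, so the equivalent modulus is E / 1.30.
E_equivalent = E / (1.0 + displacement_increase)

coef_tapered = 0.3118           # coefficient of p*t/gamma for the elliptical web
coef_rectangular = 4.0 / 7.0    # coefficient for the large rectangular web, formula (102)
ratio = coef_tapered / coef_rectangular

print(f"equivalent modulus ~ {E_equivalent:,.0f} lbs. per sq. in.")
print(f"shear coefficient ratio = {ratio:.3f} (reduction {100 * (1 - ratio):.0f} percent)")
```

The ratio of about 0.55 is consistent with the statement in the summary that tapering reduces $K$ to "little more than one-half" of its value for the rectangular web.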


J. Summary of Dr. Yeh's Thesis

1. The expression for the shearing stress in the weld was found by
several different methods to be of the form:

$$q_s = K\,\frac{x}{\sqrt{L^2 - x^2}} \qquad (100)$$

where $K$ is a coefficient, which is constant for a given joint, but depends
upon the dimensions and form of the structural members and upon the
elastic properties of the weld as well as of the plate and web.

Let $t$ and $t'$ be the thicknesses of plate and weld respectively,

$p$ = the longitudinal stress due to the applied load,
$\gamma$ = the aggregate sectional area of the throat of the welds per unit
length,
$E$ = the modulus of elasticity of the metal of plate and web,
$E'$ = the modulus of elasticity of the welding metal,
$1/m = 1/3.64$ = Poisson's ratio.




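Then if the plate and web are of so large extent that their form does
not affect the stresses in the weld, $K$ is given by Dr. Yeh as:

$$K = \frac{4}{7}\,\frac{p}{\gamma}\,\frac{2tt'}{(t + t')}\,\frac{2E'}{(E' + E)} \qquad (101)$$

When $E = E'$ and $t = t'$:

$$K = \frac{4}{7}\,\frac{pt}{\gamma} \qquad (102)$$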


It was found by experiments with the test piece to which the theory
was applied, that the factor in (102) should be multiplied by 7/16 in
order to bring about agreement between the theory and the tests, a
correction which is probably necessitated by the presence of the finite
boundaries. The formula for the shearing stress becomes in that case:

$$q_s = \frac{pt}{4\gamma}\,\frac{x}{\sqrt{L^2 - x^2}} \qquad (103)$$

The lower the value of $E'$ is, the more yielding will be the weld and
the less will be the load which is transferred to the web. On the other
hand, if $E'$ is very great, the factor 4/7 in (102) will be replaced by 8/7.
This is the case of a rigid weld.
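The corrected factor and its consequence for the load carried by the weld can be checked directly. The sketch below assumes the form $q_s = K\,x/\sqrt{L^2 - x^2}$ with $K = \frac{4}{7}\cdot\frac{7}{16}\cdot\frac{pt}{\gamma}$, and uses the closed-form integral $\int_0^x \xi\,d\xi/\sqrt{L^2 - \xi^2} = L - \sqrt{L^2 - x^2}$. The specimen values are those quoted in the text, and the function name is illustrative.

```python
import math
from fractions import Fraction

# Theoretical factor 4/7 multiplied by the experimental correction 7/16.
factor = Fraction(4, 7) * Fraction(7, 16)

p = 20_000.0    # lbs. per sq. in.
t = 0.25        # in., plate thickness
L = 12.0        # in., half-length of the weld
gamma = 0.707   # sq. in. per in., throat area of the welds per unit length

K = float(factor) * p * t / gamma

def weld_force_up_to(x):
    """Shearing force gamma * integral of q_s from 0 to x, in closed form."""
    return gamma * K * (L - math.sqrt(L * L - x * x))

total = weld_force_up_to(L)

print(f"corrected factor = {factor}")
print(f"total force carried by the weld = {total:,.0f} lbs.")
```

With these numbers the weld transmits 15,000 lbs., that is $W/2$, in agreement with the equal division of the load between plate and web.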

2. The distribution of stresses across any transverse section of the
web and of the plate is fairly uniform, except near the ends of the
weld, provided the members are fairly narrow as in the specimen here
under consideration. As seen from Fig. 9 the curves for the average
stresses in the web and in the plate, $p_b$ and $p_a$, as found by the author's
theory in Chapter I, agree quite closely with those obtained from
Dr. Yeh's theory and from experimentation.

3. The displacement coefficient $\mu$ by which, according to the author's
theory, the shearing stress in the weld at any point $x$ shall be multiplied
in order to get the average linear displacement of a section of the web
relative to the corresponding section of the plate at that point, was
found in Dr. Yeh's analysis to be very variable, being a complicated
function of $x$ as seen from the curve in Fig. 9. Its average value
$0.158 \times 10^{-6}$ gives an equal distribution of the machine load between
the web and the plate as it should do. Dr. Yeh's analysis gives infinite
stresses at the ends of the weld, while the assumption of a fixed value
of $\mu$, as used by the author, gives finite shearing stresses at those points.

4. Tapering of the web was found to make the weld more yielding.
By assuming the web to be elliptical instead of rectangular, but of the
same principal dimensions, the value for $K$ with a plate of infinite
extent was reduced to little more than one-half of the value given
in (102). It seems advantageous therefore to taper the ends of reenforcing
girders as actually done quite generally in practice.

V.  AN  EXPERIMENTAL  STUDY  OF  THE  STRESSES  IN  THE  SPECIMEN 
REFERRED  TO  IN  CHAPTER  IV 

This study was made as a thesis for the degree of Master of Science
by a student in the Course of Naval Construction, A. M. Zollars,
Lieutenant (jg) (CC) of the United States Navy, together with E. P.
Worthen of the Course of Naval Architecture. The thesis is entitled:
"An Experimental Determination of the Distribution of Longitudinal
Shearing Stress in a Continuous Weld."

The primary object was to obtain by strain measurements a picture
of the stress field in the plate and webs of the test specimen for which,
as described in Chapter IV, Dr. Yeh had obtained a theoretical solution.
Dr. Yeh cooperated in this experimental work, which in fact
may be regarded as supplemental to his thesis.

Rosettes of strain measurements were taken at each point selected
for observation, one in the longitudinal or X-direction, one in the
transverse or Y-direction and two at 45° to these. Three such strains are
indeed sufficient for the determination of the principal stresses in
magnitude and direction, but the fourth served as a check. The strains were
measured with Huggenberger tensometers as shown in Fig. 10 at the
points indicated on Fig. 13, located on sections at the middle and at
3", 6", 9", 11", 12", and 13" from the middle on one end of the specimen.
For the 12" and 13" sections measurements were taken on the plate only.

From the strains so obtained were calculated the longitudinal and
transverse stresses as well as the magnitude and directions of the
principal stresses, making use of known formulas.*
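The reduction of a four-gauge rosette can be sketched as follows. This is not the procedure of the cited paper, whose formulas are not reproduced in this section. It is a minimal illustration using the standard strain-transformation and plane-stress relations, with gauges assumed at 0°, 90°, 45° and 135° from the X-direction, and with $E$ and Poisson's ratio $1/m = 1/3.64$ taken from the text. All function and key names are hypothetical.

```python
import math

E = 30.2e6         # lbs. per sq. in., modulus quoted for the specimen
NU = 1.0 / 3.64    # Poisson's ratio 1/m quoted in the text

def reduce_rosette(e0, e90, e45, e135):
    """Reduce four gauge strains (0, 90, 45, 135 degrees from X) to plane stresses."""
    gamma_xy = e45 - e135                  # engineering shear strain
    closure = (e45 + e135) - (e0 + e90)    # zero when the four readings agree
    c = E / (1.0 - NU * NU)
    s_x = c * (e0 + NU * e90)
    s_y = c * (e90 + NU * e0)
    t_xy = E / (2.0 * (1.0 + NU)) * gamma_xy
    mean = 0.5 * (s_x + s_y)
    radius = math.hypot(0.5 * (s_x - s_y), t_xy)
    angle = 0.5 * math.degrees(math.atan2(2.0 * t_xy, s_x - s_y))
    return {"sigma_x": s_x, "sigma_y": s_y, "tau_xy": t_xy,
            "sigma_1": mean + radius, "sigma_2": mean - radius,
            "angle_deg": angle, "closure": closure}

def strains_from_stress(s_x, s_y, t_xy):
    """Synthetic gauge readings for a known plane-stress state (for checking only)."""
    e0 = (s_x - NU * s_y) / E
    e90 = (s_y - NU * s_x) / E
    g = 2.0 * (1.0 + NU) * t_xy / E
    half = 0.5 * (e0 + e90)
    return e0, e90, half + 0.5 * g, half - 0.5 * g

result = reduce_rosette(*strains_from_stress(20_000.0, 0.0, 3_000.0))
for key, value in result.items():
    print(f"{key:>9}: {value: .4f}")
```

The `closure` entry plays the part of the fourth reading: the two 45° strains must sum to the same value as the longitudinal and transverse strains, so a large closure flags a faulty gauge reading.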

The load for which the analysis was carried out was $W = 30{,}000$ lbs.,
which, since the sectional area of the plate was A = 6" × 1/4" = 1.5
sq. in., gave an average stress in the plate outside the web of $p = 20{,}000$
lbs. per sq. in. The area of the web was equal to that of the plate,
a = 2 × 3" × 1/4" = 1.5 sq. in., not including the plate between the two
parts of the web. The aggregate transverse sectional area of the four
fillet welds was 1/8 sq. in., the welds being 1/4" deep along the sides. The [...]

* Hovgaard: "Determination of Stresses in Plating from Strain Measurements,"
Soc. Nav. Arch. & Mar. Eng., New York, 1931.




Referring  now  to  Fig.  11  the  longitudinal  stress  was  plotted  for  the 
various  transverse  sections  where  measurements  were  taken.  The  area 
subtended  by  such  a  curve,  divided  by  the  width  of  the  plate  or  the 
web  gave  the  average  stress  for  each  section.  It  was  found  that  at 
mid  section  the  stresses  were  practically  uniform  across  the  whole  width 
of  the  plate  or  web,  and  were  of  about  the  same  magnitude  in  both. 
This  confirms  the  assumption  made  in  Dr.  Yeh’s  thesis  that  the  load 
is  equally  distributed  between  plate  and  web. 

The stress curves for the transverse sections of the web have a
maximum at the centre line, which becomes more pronounced as the
end of the web is approached. The corresponding stress curves for the
plate, on the other hand, have a minimum at the centre line except at
the end of the weld, where it has a sharp maximum. At a short distance
beyond the end of the weld, however, this maximum falls off.

As shown in Fig. 12, curves were plotted on a longitudinal base for
the average longitudinal stress $p_a$ on the plate and $p_b$ on the web. The
slope of the curve for the plate is approximately equal to the negative
of the slope for the web as it should be, since the pull on the plate must
increase at the same rate as the pull on the web decreases. The rates
of increase or decrease in the pull must evidently be equal to the
shearing forces in the weld, whence:

$$\gamma q_s = Q_s.$$


Since in this case $A = a$, the shearing force in the weld at any point
$x$ could be obtained by multiplying the average of the numerical values
of the slope of the two curves at that point by $A = 1.5$ sq. in. Only
at the end of the weld was it found impossible to determine the slope
of the curves accurately and thus the ending of the curve for $Q_s$ remained
uncertain. In order to find this terminal value approximately, it was
argued that the pull represented by the total area of the curve for $Q_s$,
that is, the total shearing force transmitted by the weld, must be equal
to the total pull on the web at mid-section. The latter force was easily
calculated from the strain measurements at that section, and was found
to be equal to 15,000 lbs. Now the curve for $Q_s$ was so adjusted that
the subtended area had the corresponding value. This required a small
modification of the curve.

From the curve for $Q_s$ the curve for the shearing stress per square
inch, $q_s$, was then obtained by dividing it by $\gamma = 0.707$ sq. in. per
inch run. Both curves are shown in Fig. 12.




The area under the curves for $p_a$ and $p_b$ up to any point $x$ is
proportional to the displacements of plate and bar respectively, relative to
mid-section. It follows that the area between the two curves represents
the displacement of the plate relative to the web, denoted above
by $U_s$, the value of which is obtained by dividing this area by the modulus
of elasticity $E$ in accordance with (93). A curve was drawn for $U_s$.

[Fig. 12: Shearing Stress in Weld]


Now the displacement coefficient $\mu$ could be obtained for any section
from the equation:

$$\mu = U_s/q_s$$

The value of $\mu$ is plotted in Fig. 12. While the theoretical value of $\mu$
at mid-section is obviously 0/0, the experimental curve indicates a definite
value at that point and is drawn so as to run to a horizontal tangent.
The average experimental value of $\mu$ over the whole length of the weld
was found to be $0.224 \times 10^{-6}$, while the theoretical value given by
Dr. Yeh is $0.158 \times 10^{-6}$ as stated above.

Fig. 13 shows the lines of tensile and compressive principal stresses.
Time did not permit a calculation of their magnitudes.

Fig. 14 shows various curves for the shearing stress. One curve
results directly from the experiments. One is calculated from the
formula (12') in Chapter I, assuming the experimental value $\mu = 0.224
\times 10^{-6}$, and another from the same formula assuming the theoretical
value $\mu = 0.158 \times 10^{-6}$. One curve is drawn representing Dr. Yeh's
purely theoretical curve, formula (103). The maximum shearing stress
$q_1$ in the four cases is respectively 11,600, 7900, 9500 lbs. per sq. in.
and infinity. Finally a curve is drawn terminating at the yield point
of the welding material in shearing, as explained in Chapter VI.


VI.  CONCLUDING  REMARKS 

The investigation began with an approximate solution based on the
preliminary and tentative assumption that in a given longitudinal weld
the average displacement of the connected members relative to each
other bears a constant ratio to the shearing stress throughout
the weld. This led to the formula for the shearing stress

$$q_s = \frac{p}{m\mu E}\,\frac{\sinh mx}{\cosh mL} \qquad (12')$$


according to which this stress increases slowly from zero at the middle
of the weld until it runs very steeply to a maximum at the ends. This
characteristic law of variation was verified in a general way
experimentally, but the solution was admittedly empirical and it gave no
information of the stress distribution in the members connected by
the weld.
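Formula (12') can be evaluated for the test specimen as a rough check on the maxima read from Fig. 14. The sketch below takes the formula as $q_s = \frac{p}{m\mu E}\,\sinh mx/\cosh mL$. The constant $m$ is not defined in this section, so the code assumes $m^2 = \frac{\gamma}{\mu E}\left(\frac{1}{A} + \frac{1}{a}\right)$, the form that follows from the equilibrium relations for $p_a$ and $p_b$ together with a constant $\mu$. That definition is an assumption here and should be compared with Chapter I.

```python
import math

p = 20_000.0     # lbs. per sq. in.
E = 30.2e6       # lbs. per sq. in.
L = 12.0         # in., half-length of the weld
gamma = 0.707    # sq. in. per in.
A = a = 1.5      # sq. in., sectional areas of plate and web

def q_12_prime(x, mu):
    """Shearing stress from formula (12') for a constant displacement coefficient mu."""
    # Assumed definition of m (not given in this section).
    m = math.sqrt(gamma * (1.0 / A + 1.0 / a) / (mu * E))
    return p * math.sinh(m * x) / (m * mu * E * math.cosh(m * L))

q_max_experimental_mu = q_12_prime(L, 0.224e-6)
q_max_theoretical_mu = q_12_prime(L, 0.158e-6)

print(f"mu = 0.224e-6: q_max = {q_max_experimental_mu:,.0f} lbs. per sq. in.")
print(f"mu = 0.158e-6: q_max = {q_max_theoretical_mu:,.0f} lbs. per sq. in.")
```

Under this assumption the two end values come out close to the 7900 and 9500 lbs. per sq. in. quoted for Fig. 14, and the stress vanishes at mid-length as the text describes.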

In order to obtain a more accurate picture of the state of stress, it
was necessary to apply a more rigorous theoretical analysis, and the
first attempt in this direction was made on the structure illustrated in
Fig. 5 where two webs, placed normal to a plate, were connected to it
by spot welds at the ends and one spot weld at the middle. The stress
field for the plate was calculated and mapped out and a limited number
of strain measurements were made at a test load of 40,000 lbs. It was
found that three-eighths of the total load was transmitted to the web
under these circumstances, and a fair agreement was found between the
observed and calculated values of the longitudinal stresses.
Theoretically it proved impossible to satisfy in a rigorous manner both the
biharmonic equation $\nabla^4 F = 0$ and the boundary conditions.




A more complete solution was worked out by Dr. Yeh, who dealt
with the same structure, but the weld extended throughout the entire
length of the web. His first and most important result was a formula
for the shearing stress in the weld:

$$q_s = K\,\frac{x}{\sqrt{L^2 - x^2}} \qquad (100)$$

This equation makes the shearing stress infinite at the ends of the
weld, but otherwise the curve is not very different from that obtained
by formula (12').

By the method of complex integration Dr. Yeh found a stress function
under the assumption that both plate and webs are of infinite
extent, and connected only along the weld. This stress function, which
satisfies the biharmonic equation, was then modified by the addition
of other functions so constructed as to fulfil the boundary conditions
approximately, while strictly preserving the agreement with the biharmonic
equation. The shearing stresses along the weld were assumed
to conform to the formula just given with an experimentally determined
value of $K$. In this way the correction for the residual stresses along
the external boundaries was much simplified and the state of stress was
completely determined both for the plate and for the web.

With the experimental value of $K$, equation (100) becomes:

$$q_s = \frac{pt}{4\gamma}\,\frac{x}{\sqrt{L^2 - x^2}} \qquad (103)$$


Let us now compare the two formulas (12') and (103). The average
value of $\mu$ calculated on the basis of Dr. Yeh's theory was $0.158 \times 10^{-6}$,
while subsequent experiments gave $\mu = 0.224 \times 10^{-6}$, and an inspection
of Fig. 14 shows that when the latter value is applied to equation (12')
a curve for the shearing stress is obtained which falls very close to the
experimental curve. If then we have the means of estimating the
value of $\mu$ in any given case, formula (12') will probably serve for all
practical purposes.

The curve representing (103) in Fig. 14 falls considerably below the
experimental curve in the outer one-half of the weld, but finally it
rises steeply above that curve and goes up to infinity at the end.
Evidently the stress cannot exceed the yield point, but if we reduce the
terminal stress to that at the yield point, say $q_k$, in the curve for $q_s$
given by Dr. Yeh, we must at the same time increase the ordinates
elsewhere. This follows because the area subtended by the curve for
$Q_s$, which is proportional to that for $q_s$, shall represent to proper scale
the total pull of the weld, which must be equal to the known pull on
the web at mid-section. By this modification Dr. Yeh's curve is made
to come closer to the experimental curve. In order to make the curve
for $q_s$ fulfil those two conditions, let us write (103) in the form:


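$$q_s = \frac{pt}{4\gamma}\,\frac{rx}{\sqrt{k^2L^2 - x^2}} \qquad (104)$$

where $r$ and $k$ are new coefficients. In order to make $q_s = q_k$ when
$x = L$, we must have:

$$q_k = \frac{pt}{4\gamma}\,\frac{r}{\sqrt{k^2 - 1}} \qquad (105)$$

and in order to make the area of the curve for $Q_s$ equal to the pull on
the web at mid-section, in the present case $W/2$, we must have:

$$\frac{W}{2} = \int_0^L Q_s\,dx = \frac{ptr}{4}\int_0^L \frac{x\,dx}{\sqrt{k^2L^2 - x^2}}$$

therefore:

$$W = \frac{ptrL}{2}\left[k - \sqrt{k^2 - 1}\right] \qquad (106)$$

In the specimen here under consideration $W = 2bpt = Lpt/2$.

Hence:

$$r = \frac{1}{k - \sqrt{k^2 - 1}} \qquad (107)$$

From (105):

$$q_k = \frac{pt}{4\gamma\left[k - \sqrt{k^2 - 1}\right]\sqrt{k^2 - 1}} \qquad (108)$$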


This  equation  can  be  solved  by  graphical  construction,  plotting  the 
expression  on  the  right-hand  side  for  various  values  of  k,  and  then 
finding  k  for  that  ordinate  which  is  equal  to  the  known  value  of  9*. 
After  that  r  can  be  found  from  (107). 




We are thus able to construct a curve for $q_s$ which fulfils the required
conditions.

Apply this method to the test specimen for a load of W = 30,000 lbs.,
b = 3", t = 1/4", p = W/2bt = 20,000 lbs. per sq. in. Take the yield
point of the material of the weld in shearing equal to one-half of the
yield point in tension, which is estimated at 36,000 lbs. per sq. in.
Thus $q_k$ = 18,000 lbs. per sq. in. The shearing area of the weld is
γ = .707 sq. in. per in., and L = 12".

From (108) it is found graphically that k = 1.006 makes $q_k$ = 18,000
lbs. per sq. in., and substituting this value of k in (107) we find r = 1.115.
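
The graphical solution can be checked numerically. The following is a minimal sketch, assuming that (107) and (108) read $r = 1/[k - \sqrt{k^2-1}]$ and $q_k = pt/\{4\gamma\sqrt{k^2-1}\,[k - \sqrt{k^2-1}]\}$, and that the thickness is the value implied by p = W/2bt. The function names `terminal_stress` and `solve_k` are illustrative only, and bisection is substituted here for the graphical construction.

```python
import math

def terminal_stress(k, p_t, gamma):
    """Assumed right-hand side of (108): stress at x = L for a given k."""
    s = math.sqrt(k * k - 1.0)
    return p_t / (4.0 * gamma * s * (k - s))

def solve_k(q_k, p_t, gamma, lo=1.0 + 1e-9, hi=2.0, tol=1e-12):
    """Bisection on k; terminal_stress falls steadily as k grows beyond 1."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if terminal_stress(mid, p_t, gamma) > q_k:
            lo = mid   # stress still above the yield value, so k must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

W, b, p = 30000.0, 3.0, 20000.0      # lbs., in., lbs. per sq. in.
t = W / (2.0 * b * p)                # thickness implied by p = W/2bt
gamma, q_k = 0.707, 18000.0          # sq. in. per in., lbs. per sq. in.

k = solve_k(q_k, p * t, gamma)
r = 1.0 / (k - math.sqrt(k * k - 1.0))   # assumed form of (107)
print(round(k, 4), round(r, 4))
```

Under these assumptions the computed k and r agree with the graphical values quoted above to about three significant figures.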

Substituting in (104) we find $q_s$ for various values of x and are thus
able to plot a curve, which is given in Fig. 14, representing Dr. Yeh's
formula as modified, so as to terminate at the yield point of the
material. This curve, which is drawn in a heavy full line, is seen to follow
the experimental curve very closely up to about x = 8½", where the
curves intersect. After that it falls somewhat below the experimental
curve, until close to the end where it again rises above that curve.

Fig. 15

Even a considerable variation in the assumed yield point makes but
very little change in the curve for $q_s$ except very close to the
terminal point.

Other  cases,  where  the  load  transmitted  to  the  web  has  a  different 
value  and  where  therefore  also  K  of  (100)  is  different,  can  be  dealt 
with  in  the  same  manner,  but  in  any  case  the  method  requires  an 
experimental  estimate  of  the  load  transmitted  to  the  web  at  mid-section. 

Dr. Yeh worked out a solution for a tapered web, indicating a marked
reduction in the shearing stresses in the weld. It is intended in the
coming year to check this formula experimentally by strain measurements
on the test piece used by Dr. Yeh after appropriate changes
have been made in the form of the webs.

It is of interest to note that Dr. Yeh's solution for the plate with
normal webs applies with small modification to a plate of the form
shown in Fig. 15. Imagine the part outside $F_1A_1$ to be turned upwards,
and the part outside $F_2A_2$ to be turned downwards until they are normal
to the plate. Cut the plate in two along the center line ED and now
join the two plates so obtained so that the edges of the centre line
become the external edges; then we shall have the test piece analyzed
by Dr. Yeh, except that $\sigma_y$ stresses of small magnitude will act normal
to the greater part of the external edges, that is, the centre line in
Fig. 15. The throat area of the welds is replaced by the plate edges
$F_1A_1$ and $F_2A_2$.

The problem of stresses in welds is one of increasing practical importance
and much more comprehensive than would appear from the
present investigation. We have here dealt only with longitudinal welds,
that is, welds running in the direction of the pull, but transverse welds
occur in a variety of forms and their stress analysis may prove more
difficult to deal with. The solution here developed of one particular
aspect of welding is not offered as a finality, but rather as the first step
into a new and extensive field of research.


FORMULAE  GIVING  THE  CHANGE  IN  GREEN’S  FUNCTION 
AND  IN  THE  CONJUGATE  FUNCTION* 

By J. G. ESTES

Let G be the Green's function with pole at infinity for the region
exterior to the contour C in fig. 1, $\bar G$ be the Green's function with pole
at infinity for the region exterior to the contour C', and let H and $\bar H$
be the functions conjugate to G and $\bar G$, respectively. In this paper are
developed formulae for finding the change at P in G and in H when
the circle C changes into the nearby contour C'. These formulae may
be used in any problem where the pole of the Green's function is at
infinity, and where the region involved can be mapped on a circle.*

* J. Hadamard gives a formula for the change in Green's function when the
pole is at any point in the plane (Calcul des Variations, p. 303), of which the
present formula is a special case. However, the method of development used in
this paper is believed to be new. Any other formula for δH is not known to
the author.

*  See  a  paper  by  the  author  entitled  “The  Lift  and  Moment  of  an  Arbitrary 
Aerofoil”  in  the  Journal  of  the  Aeronautical  Sciences. 




The function which maps the region exterior to C on the unit circle,
the point at infinity going into the origin, is $w = r/z$, where $z = \rho e^{i\varphi}$.
Hence*

$$G = \ln\frac{\rho}{r} \tag{1}$$

and

$$H = \varphi. \tag{2}$$

Let us now proceed with the derivation of the formulae for δG and δH.
If C (fig. 1) is a grounded, perfect conductor, and a unit charge of
electricity is placed at infinity, a charge of equal and opposite sign will
be induced on C, and the distribution of this charge on C will be uniform.
G(P) is defined* as the potential at P of the charge induced on C. Let
us find the potential at the point P of this induced charge. The
potential at P due to the charge on a small length of arc ds is
$\frac{ds}{2\pi r}\,2\ln\frac{d}{d_0}$,* and the total potential at P is

$$G[r(\alpha)] = \frac{1}{2\pi}\int_0^{2\pi}\ln\frac{d^2}{d_0^2}\,d\alpha \tag{3}$$

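the notation G[r(α)] meaning that the value of G at P depends on the
contour r = r(α). The quantity $d_0$ is determined by setting the above
integral equal to the known value of G from eq. (1), i.e.,

$$\frac{1}{2\pi}\int_0^{2\pi}\ln\frac{d^2}{d_0^2}\,d\alpha = \ln\frac{\rho}{r}. \tag{4}$$

From eq. (4), $d_0^2 = \rho r$, and since $d^2 = \rho^2 + r^2 - 2\rho r\cos(\varphi - \alpha)$,
eq. (3) becomes

$$G[r(\alpha)] = \frac{1}{2\pi}\int_0^{2\pi}\ln\frac{\rho^2 + r^2 - 2\rho r\cos(\varphi - \alpha)}{\rho r}\,d\alpha. \tag{5}$$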


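The functional derivative* G'[r(α)] is

$$G'[r(\alpha)] = \frac{1}{2\pi}\frac{\partial}{\partial r}\ln\frac{\rho^2 + r^2 - 2\rho r\cos(\varphi - \alpha)}{\rho r} = \frac{1}{2\pi r}\,\frac{r^2 - \rho^2}{\rho^2 + r^2 - 2\rho r\cos(\varphi - \alpha)}$$

$$= -\frac{1}{\pi r}\left[\frac{1}{2} + \frac{r}{\rho}\cos(\varphi - \alpha) + \left(\frac{r}{\rho}\right)^2\cos 2(\varphi - \alpha) + \cdots\right] \tag{6}$$

$$= -\frac{1}{\pi r}\left[\frac{x_0}{2} + \sum_{n=1}^{\infty}(x_n\cos n\alpha + y_n\sin n\alpha)\right] \tag{7}$$

where $x_0 = 1$, $x_n = (r/\rho)^n\cos n\varphi$, $y_n = (r/\rho)^n\sin n\varphi$.

* Kellogg, O. D.: Foundations of Potential Theory, page 365.

* Kellogg, O. D.: Foundations of Potential Theory, page 236.

* MacMillan, W. D.: Theory of the Potential, page 35.

* Volterra, V.: Leçons sur les Fonctions de Lignes.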

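The change in G at P due to a small element of arc of C moving to C'
along a normal to C will be $G'[r(\alpha)]\,\delta N\,d\alpha$, and the total change in G
at P will be

$$\delta G(P) = \int_0^{2\pi} G'[r(\alpha)]\,\delta N\,d\alpha$$

$$= -\frac{1}{\pi r}\int_0^{2\pi}\left[\frac{x_0}{2} + \sum_{n=1}^{\infty}(x_n\cos n\alpha + y_n\sin n\alpha)\right]\left[A_0 + \sum_{n=1}^{\infty}(A_n\cos n\alpha + B_n\sin n\alpha)\right]d\alpha,$$

where $\delta N = A_0 + \sum_{n=1}^{\infty}(A_n\cos n\alpha + B_n\sin n\alpha)$, so that

$$\delta G(P) = -\frac{1}{r}\sum_{n=0}^{\infty}(A_n x_n + B_n y_n) \tag{8}$$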

and we have $\bar G(P)$ given by the relation

$$\bar G(P) = G(P) + \delta G(P),$$

where δG(P) is given by eq. (8).
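
Formula (8) can be tried on two displacements for which the exact answer is known. The sketch below assumes the reading $\delta G = -(1/r)\sum_{n\ge 0}(A_n x_n + B_n y_n)$ with $x_n = (r/\rho)^n\cos n\varphi$, $y_n = (r/\rho)^n\sin n\varphi$; the function name `delta_G` and the numerical values are illustrative only. A uniform growth of the circle corresponds to $A_0 = \epsilon$, and a rigid shift by $\epsilon$ along the x-axis corresponds, to first order, to $A_1 = \epsilon$.

```python
import math

def delta_G(A, B, r, rho, phi):
    """Assumed form of eq. (8): -(1/r) * sum over n of (A_n x_n + B_n y_n)."""
    total = 0.0
    for n, (a_n, b_n) in enumerate(zip(A, B)):
        scale = (r / rho) ** n
        total += a_n * scale * math.cos(n * phi) + b_n * scale * math.sin(n * phi)
    return -total / r

r, rho, phi, eps = 1.0, 2.5, 0.7, 1e-4

# Check 1: the circle grows uniformly, r -> r + eps, so that G = ln(rho/(r+eps)).
exact_growth = math.log(rho / (r + eps)) - math.log(rho / r)
approx_growth = delta_G([eps], [0.0], r, rho, phi)

# Check 2: the circle is shifted by eps along x, so that G = ln(|z - eps| / r).
x, y = rho * math.cos(phi), rho * math.sin(phi)
exact_shift = math.log(math.hypot(x - eps, y) / r) - math.log(rho / r)
approx_shift = delta_G([0.0, eps], [0.0, 0.0], r, rho, phi)

print(exact_growth, approx_growth)
print(exact_shift, approx_shift)
```

In both cases the first-order formula and the exact change should differ only by terms of order $\epsilon^2$.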


We wish now to derive a formula for finding δH. We shall first set
up H[r(α)] as an integral around the contour C, as was done in eq. (3)
for G. The integrand in this integral will be a function which is conjugate
to the integrand in eq. (3). Such a function is

$$2\tan^{-1}\frac{\rho\sin\varphi - r\sin\alpha}{\rho\cos\varphi - r\cos\alpha} + A$$

where A is a constant.


If/J. 


V 


I  rTtan-'”"”'’"’’”'"* 

2ir^  L  p  008^— rcosa 


da 


* Franklin, P.: Differential Equations for Electrical Engineers, page 272.




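Hence we may write

$$H[r(\alpha)] = \frac{1}{2\pi}\int_0^{2\pi}\left[2\tan^{-1}\frac{\rho\sin\varphi - r\sin\alpha}{\rho\cos\varphi - r\cos\alpha} + A\right]d\alpha \tag{9}$$

and therefore

$$H'[r(\alpha)] = \frac{1}{2\pi}\frac{\partial}{\partial r}\left[2\tan^{-1}\frac{\rho\sin\varphi - r\sin\alpha}{\rho\cos\varphi - r\cos\alpha}\right] = \frac{\rho\sin(\varphi - \alpha)}{\pi\left[\rho^2 + r^2 - 2\rho r\cos(\varphi - \alpha)\right]}. \tag{10}$$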

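The Fourier expansion of eq. (10) is found by multiplying the right
hand side of eq. (6) by sin (φ − α), and simplifying. We have

$$H'[r(\alpha)] = \frac{1}{\pi r}\left[\frac{r}{\rho}\sin(\varphi - \alpha) + \left(\frac{r}{\rho}\right)^2\sin 2(\varphi - \alpha) + \cdots\right] \tag{11}$$

$$= \frac{1}{\pi r}\sum_{n=1}^{\infty}(y_n\cos n\alpha - x_n\sin n\alpha) \tag{12}$$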

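where $x_n$ and $y_n$ are as in eq. (7). The total change in H at P is

$$\delta H(P) = \int_0^{2\pi} H'[r(\alpha)]\,\delta N\,d\alpha$$

$$= \frac{1}{\pi r}\int_0^{2\pi}\left[\sum_{n=1}^{\infty}(y_n\cos n\alpha - x_n\sin n\alpha)\right]\left[A_0 + \sum_{n=1}^{\infty}(A_n\cos n\alpha + B_n\sin n\alpha)\right]d\alpha$$

$$= \frac{1}{r}\sum_{n=1}^{\infty}(A_n y_n - B_n x_n) \tag{13}$$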

and hence

$$\bar H(P) = H(P) + \delta H(P),$$

where δH(P) is given by eq. (13).
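
The formula for δH can be tested in the same way on rigid shifts of the circle, for which the conjugate function is simply the argument of z measured from the new centre. This is a sketch under the assumption that (13) reads $\delta H = (1/r)\sum_{n\ge 1}(A_n y_n - B_n x_n)$; the name `delta_H` and the numbers are illustrative.

```python
import cmath
import math

def delta_H(A, B, r, rho, phi):
    """Assumed form of eq. (13): (1/r) * sum over n >= 1 of (A_n y_n - B_n x_n)."""
    total = 0.0
    for n in range(1, len(A)):
        scale = (r / rho) ** n
        x_n = scale * math.cos(n * phi)
        y_n = scale * math.sin(n * phi)
        total += A[n] * y_n - B[n] * x_n
    return total / r

r, rho, phi, eps = 1.0, 2.5, 0.7, 1e-4
z = cmath.rect(rho, phi)          # the point P as a complex number

# Shift along x: deltaN = eps*cos(alpha), i.e. A_1 = eps.
exact_x = cmath.phase(z - eps) - phi
approx_x = delta_H([0.0, eps], [0.0, 0.0], r, rho, phi)

# Shift along y: deltaN = eps*sin(alpha), i.e. B_1 = eps.
exact_y = cmath.phase(z - 1j * eps) - phi
approx_y = delta_H([0.0, 0.0], [0.0, eps], r, rho, phi)

print(exact_x, approx_x)
print(exact_y, approx_y)
```

A uniform growth of the circle ($A_0$ only) leaves H unchanged, which is why the sum in `delta_H` starts at n = 1.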


REGIONS OF POSITIVE AND NEGATIVE CURVATURE ON
CLOSED SURFACES*


By Philip Franklin

The surface integral of the Gaussian curvature, k, taken over the
whole of a regular, closed surface of topological genus p, depends only
on this genus. In fact:

$$\int k\,dS = 4\pi(1 - p). \tag{1}$$

In particular, if the surface has the topological character of an anchor
ring, p = 1 and the integral is zero. Thus there are points on the
surface where the curvature is positive, and also points where it is
negative. On the anchor ring itself these points form two regions,
neither of which is topologically equivalent to a circle. This led
J. Douglas* to raise the question as to whether this was a general
property of surfaces of genus one. A similar question for closed surfaces
of higher genus is whether on such a surface there can be a region
of positive curvature, or one of negative curvature of the connectivity
of a circle.

In  this  note  we  show  by  examples  that  these  properties  do  not  hold 
for  all  surfaces.  However,  if  on  a  surface  of  any  genus,  a  region  of 
positive  (or  negative)  curvature  contains  no  umbilics,  either  inside  or 
on  its  boundary,  and  the  bounding  curve  of  zero  curvature  has  no 
singular  points  or  points  where  a  normal  section  of  the  surface  has  a 
point  of  inflection  at  the  point,  then  the  region  is  not  equivalent  to 
a  circle. 

The method of proof, essentially an application of Euler's theorem,
is also applied to a theorem on the existence of umbilics and to the
calculation of the total curvature of one-sided surfaces.

1. The examples. Fig. 1 shows a surface of genus one in which the
region of positive curvature is equivalent to a circle, and fig. 2 one in
which the region of negative curvature is equivalent to a circle. The
surfaces are in each case easily visualized as slightly deformed anchor
rings. It would be possible to construct them analytically by making
a plane of symmetry cut the surface in a circle and another curve with
two points of inflection as in the figures, and having the sections
perpendicular to this plane, and along the radii of the circle, themselves
circles.

* Presented to the American Mathematical Society, April, 1931.

* Bulletin A. M. S., vol. 36, 1930, p. 798, abstract no. 395.

Fig.  3  illustrates  the  construction  of  a  surface  of  genus  3,  with  a  region 
of  positive  curvature  topologically  similar  to  a  circle.  It  is  clear  that 
by  having  p  holes  in  the  surface,  an  example  for  genus  p  is  obtained. 


A surface of genus 2, with a region of negative curvature equivalent
to a circle, is illustrated in fig. 4. For the case of genus p, we have
merely to make the body of the figure one of revolution about AB as
an axis, and distribute p handles symmetrically about this axis.

2. Regular bounding curves. The bounding curves separating a
region of positive from one of negative curvature are curves along which
the curvature is zero. We shall call such curves regular if they contain
no umbilical points, singular points of the curve, or points for which
some normal section of the surface has a point of inflection at the
point. We proceed to show that regular bounding curves are necessarily
lines of curvature.

We refer the surface to axes, two of which are tangent to the lines of
curvature at the point in question, the third being normal to the surface,
so that the equation of the surface has the form:

$$z = Ax^2 + By^2 + o(R^2), \qquad R^2 = x^2 + y^2. \tag{2}$$

At a point of zero curvature, AB = 0, and if the point is not an umbilic,
A ≠ B. Thus one of these is zero, say B, and we may write:

$$z = \tfrac{1}{2}kx^2 + \tfrac{1}{6}(ax^3 + 3bx^2y + 3cxy^2 + dy^3) + o(R^3). \tag{3}$$

This gives for the partial derivatives:

$$r = z_{xx} = k + ax + by + o(R),$$

$$s = z_{xy} = bx + cy + o(R),$$

$$t = z_{yy} = cx + dy + o(R). \tag{4}$$




Since the Gaussian curvature is given by*

$$K(1 + p^2 + q^2)^2 = rt - s^2, \tag{5}$$

the equation of the curve of zero curvature through the point in question
is:

$$[k + ax + by + o(R)][cx + dy + o(R)] - [bx + cy + o(R)]^2 = 0, \tag{6}$$

or

$$k(cx + dy) + o(R) = 0. \tag{7}$$

If the curve has no singular point, we can not have both c and d = 0.
But, if d ≠ 0, the normal section of the surface (3) by the plane
x = 0 is:

$$z = \tfrac{1}{6}dy^3 + o(R^3), \tag{8}$$

which has a point of inflection at the origin. Hence, in view of our last
condition of regularity, this is excluded. Thus we must have d = 0,
c ≠ 0, so that the tangent to the curve (7) at the origin is x = 0, which
is tangent to a line of curvature.

Thus, since the regular bounding curve is at each point tangent to
a line of curvature, it must itself be such a line.

3. The region. Let us now consider a region of positive (or negative)
curvature, containing no umbilical points in its interior, and bounded
by a regular curve. We construct a series of lines of curvature of both
families in this region, forming an orthogonal network which divides
the region into curvilinear rectangles. At each point of the boundary
two rectangles abut, and at each inside point four rectangles abut.
Consequently, if there are m interior points and 2n points on the
boundary, we shall have for the number of vertices, lines, and rectangles,
i.e. zero-, one- and two-cells,*

$$\alpha_0 = m + 2n, \qquad \alpha_1 = \frac{4m + 6n}{2} = 2m + 3n, \qquad \alpha_2 = \frac{4m + 4n}{4} = m + n.$$

We find from this for the Euler characteristic:

$$\alpha_0 - \alpha_1 + \alpha_2 = 0. \tag{9}$$
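
The count leading to (9) can be written out mechanically. In the sketch below (function names are illustrative) the lines are counted by their ends, four at each interior vertex and three at each boundary vertex, and the rectangles by their corners, four at each interior vertex and two at each boundary vertex.

```python
def cell_counts(m, n):
    """Cells of the network with m interior vertices and 2n boundary vertices."""
    vertices = m + 2 * n
    # Every line has two ends: 4 ends at an interior vertex, 3 at a boundary vertex.
    lines = (4 * m + 3 * (2 * n)) // 2
    # Every rectangle has four corners: 4 corners meet at an interior vertex,
    # 2 at a boundary vertex.
    rectangles = (4 * m + 2 * (2 * n)) // 4
    return vertices, lines, rectangles

def euler_characteristic(m, n):
    a0, a1, a2 = cell_counts(m, n)
    return a0 - a1 + a2

print(cell_counts(4, 3), euler_characteristic(4, 3))
```

The characteristic comes out zero whatever the values of m and n, which is the content of (9).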

This shows that the region is not topologically equivalent to a circle,
since the characteristic is one for a circle. In fact, on an orientable,
or two-sided surface, the only region satisfying (9) is a ring shaped one.
For the anchor ring our conditions are realized on both sides, and the
regions of positive and negative curvature each have the topology of a
ring. In the examples shown in figures 1 to 4, the bounding curves
have angular points, at which the surface has normal sections with
points of inflection.

* Eisenhart, Differential Geometry, 1909, p. 126.

* For the notation, and invariance of the characteristic, see e.g. Lefschetz,
Topology, 1930, p. 44.

4. Umbilics. The type of argument used in the preceding section
may be used to show the existence of umbilics under certain conditions.*
In general, the lines of curvature near an umbilical point* have the
form shown in fig. 5 (type I), or that of fig. 6 (type II). The most
familiar type of fig. 7 (type III), found in the meridians and parallels
of a surface of revolution near a point on the axis, is more special in
that it requires the vanishing of all third order terms of the surface.
The umbilics on an ellipsoid are, essentially, a special case of our
second type.


Fig.  5 


Fig.  6 


Fig.  7 


Let  us  consider  a  closed  surface,  and  draw  on  it  series  of  lines  of 
curvature  belonging  to  both  families.  If  the  network  is  sufficiently  fine, 
for  the  most  part  the  surface  will  be  divided  into  curvilinear  rectangles. 
However,  this  will  not  be  the  case  near  an  umbilic,  where  we  may  regard 
the  effect  of  the  umbilic  to  be  the  replacement  of  a  block  of  rectangles 
by a single polygon. We next calculate the contribution of our rectangles,
as well as that of the polygons about the different types of
umbilics, to the Euler characteristic,

$$\alpha_0 - \alpha_1 + \alpha_2 = K. \tag{10}$$

For  a  rectangle,  we  have  four  vertices,  four  sides,  and  one  region. 

* Cf. Blaschke, Math. Zeitschrift, vol. 23 (1925), p. 617, where, however, the
author thought of the argument as involving angles, rather than merely topology,
and so restricted his discussion to analytic surfaces.

* Darboux, Leçons des Surfaces, vol. IV (1896), Note VII, p. 448, where
references to the earlier work of Cayley are given.




But, as four rectangles abut on a point, and two on a side, the contribution
to K is:

$$4(1/4) - 4(1/2) + 1 = 0. \tag{11}$$

Similarly we see from fig. 5 that an umbilic of type I is surrounded
by a hexagon with n vertices interior to the sides which contributes

$$\alpha_0 = 6(1/4) + n(1/2), \qquad \alpha_1 = (6 + n)(1/2), \qquad \alpha_2 = 1,$$

so that

$$K = -1/2. \tag{12}$$

An umbilic of type II, fig. 6, is surrounded by a polygon of two sides
with n vertices interior to the sides, and contributes

$$\alpha_0 = 2(1/4) + n(1/2), \qquad \alpha_1 = (2 + n)(1/2), \qquad \alpha_2 = 1,$$

so that

$$K = 1/2. \tag{13}$$

An umbilic of type III, fig. 7, is surrounded by a circle with n interior
vertices, so that it contributes

$$\alpha_0 = n(1/2), \qquad \alpha_1 = n(1/2), \qquad \alpha_2 = 1,$$

from which

$$K = 1. \tag{14}$$
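
The four contributions (11) to (14) follow one pattern: a corner of the polygon is shared among four cells, a vertex interior to a side between two, and a side between two. A short sketch in exact fractions (the function name `contribution` is illustrative):

```python
from fractions import Fraction

HALF, QUARTER = Fraction(1, 2), Fraction(1, 4)

def contribution(corners, n):
    """Share of the Euler characteristic carried by one polygon of the mesh.

    corners: vertices where two sides of the polygon meet (shared by 4 cells)
    n:       further vertices interior to the sides (shared by 2 cells)
    """
    a0 = corners * QUARTER + n * HALF
    a1 = (corners + n) * HALF      # each piece of boundary is shared by 2 cells
    a2 = 1
    return a0 - a1 + a2

rectangle = contribution(4, 0)
type_I = contribution(6, 3)        # hexagon, here with n = 3
type_II = contribution(2, 3)       # polygon of two sides
type_III = contribution(0, 3)      # circle
print(rectangle, type_I, type_II, type_III)
```

The number n drops out in every case, so the contribution depends only on the number of corners.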

Suppose our closed surface has $U_1$, $U_2$, $U_3$ umbilics of types I, II,
and III respectively, and no umbilics of higher types. Then, from
equations (11) to (14), the Euler characteristic is determined from our
mesh to be:

$$-U_1/2 + U_2/2 + U_3 = K = 2 - 2p. \tag{15}$$

For example, if the surface is topologically a sphere, p = 0, and
the right side is 2. Consequently there must be either two umbilics
of type III, four of type II, or more of one or other of these if any
umbilics of type I are present. Familiar illustrations of the first two
cases are furnished by an ellipsoid of revolution, and one with three
unequal axes.

For p = 1, there may be no umbilics, as the anchor ring itself
illustrates. For surfaces of higher genus, there must be umbilics of type I.
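
Relation (15) can be explored by listing the admissible numbers of umbilics for a given genus. The following sketch (the function name `umbilic_counts` and the search bound are illustrative) doubles both sides of (15) so as to work in integers.

```python
def umbilic_counts(p, max_count):
    """All (U1, U2, U3) with -U1/2 + U2/2 + U3 = 2 - 2p, each count <= max_count."""
    target = 2 * (2 - 2 * p)       # both sides of (15) doubled
    return [(u1, u2, u3)
            for u1 in range(max_count + 1)
            for u2 in range(max_count + 1)
            for u3 in range(max_count + 1)
            if -u1 + u2 + 2 * u3 == target]

# Sphere (p = 0) with no umbilics of type I.
sphere_without_type_I = [c for c in umbilic_counts(0, 4) if c[0] == 0]
print(sorted(sphere_without_type_I))

# Genus 2: every admissible combination contains umbilics of type I.
print(min(c[0] for c in umbilic_counts(2, 6)))
```

For the sphere the list contains the two cases named in the text, two umbilics of type III or four of type II, together with the mixed case of two of type II and one of type III.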

Our  conclusions  are  not  proved  in  case  the  surface  has  umbilics  of 
higher  order  than  the  types  considered. 




5. One-sided surfaces. The Gauss-Bonnet formula for the integral of
curvature (1) is usually applied to two-sided, or orientable surfaces.
However, it holds for one-sided, or non-orientable surfaces as well,
since it is an immediate consequence of Euler's theorem (10), and Gauss'
theorem that the integral of curvature over a geodesic triangle equals
its spherical excess.* For, if we cover a closed surface with a network
of geodesic triangles, we have $2\alpha_1 = 3\alpha_2$ (or $\alpha_2 = 2\alpha_1 - 2\alpha_2$), since each
2-cell is a triangle, and each 1-cell abuts on two 2-cells. Also, if $A_i$ is
any angle of a triangle, we have $\Sigma A_i = 2\pi\alpha_0$, since the total angle about
any point is 2π. Consequently, we have for the total excess:

$$\Sigma E = \Sigma A_i - \pi\alpha_2 = \pi(2\alpha_0 - \alpha_2) = 2\pi(\alpha_0 - \alpha_1 + \alpha_2),$$

and hence by Gauss' theorem and (10):

$$\int k\,dS = 2\pi K = 4\pi(1 - p). \tag{16}$$
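
The counting argument can be followed on two concrete triangulations. The cell numbers used below are assumptions brought in for illustration and are not taken from the text: the octahedral triangulation of a sphere (6 vertices, 12 lines, 8 triangles) and the six-vertex triangulation of the projective plane (6 vertices, 15 lines, 10 triangles), the latter having the same characteristic as the Boy surface.

```python
import math

def total_excess(a0, a1, a2):
    """Sum of the excesses of all triangles of a network with a0, a1, a2 cells."""
    if 2 * a1 != 3 * a2:
        raise ValueError("every 2-cell must be a triangle abutting on three 1-cells")
    angle_sum = 2 * math.pi * a0       # the total angle about each vertex is 2*pi
    return angle_sum - math.pi * a2    # subtract pi for every triangle

# Octahedral triangulation of the sphere.
sphere_K = 6 - 12 + 8
sphere_excess = total_excess(6, 12, 8)
# On the unit sphere each of the eight triangles has three right angles.
sphere_direct = 8 * (3 * (math.pi / 2) - math.pi)

# Six-vertex triangulation of the projective plane (one-sided, K = 1).
plane_K = 6 - 15 + 10
plane_excess = total_excess(6, 15, 10)

print(sphere_K, sphere_excess, sphere_direct)
print(plane_K, plane_excess)
```

In both cases the total excess equals 2πK, in agreement with (16), and for the sphere it agrees with the excess found directly from the angles.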

In  this  equation,  the  characteristic  K  is  more  appropriate  for  one-sided 
surfaces,  since  some  of  these  would  have  non-integral  values  of  p. 

The equation (15) for a relation between umbilical points also applies
to one-sided surfaces. In both (15) and (16) our surfaces must be
closed and smooth as to tangent plane and curvature. A one-sided
surface of this sort with K = 1 was given by W. Boy.* It has
self-intersections, three curves along which two sheets of the surface
intersect, and one point where three sheets intersect, but it is smooth along
each sheet. From this, smooth surfaces of any genus are easily
constructed. For, by cutting a small hole in one of the outer parts of the
Boy surface, and a similar hole in one part of a sphere with p handles,
the two may be joined by a tube in such a way that the curvature is
continuous, to give a one-sided surface with K = 1 − 2p. Similarly,
if we join two Boy surfaces with a sphere with p handles, we obtain a
one-sided surface with K = −2p. These last are more directly
constructed by taking a sphere with p ordinary handles, and one handle
formed like the Klein one-sided surface of genus zero.* To all of these
surfaces, the theorems of sections 3 and 4 apply.
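
The characteristics quoted for the joined surfaces can be obtained by simple bookkeeping. The sketch below assumes the usual rule that cutting a hole lowers the characteristic by one and that the connecting tube contributes nothing, so that joining two surfaces gives the sum of their characteristics less two; the function names are illustrative.

```python
BOY = 1                               # characteristic of the Boy surface

def sphere_with_handles(p):
    """Characteristic of a two-sided surface of genus p."""
    return 2 - 2 * p

def joined(k_a, k_b):
    """Characteristic after cutting a hole in each surface and joining by a tube."""
    return k_a + k_b - 2

def one_boy(p):
    return joined(BOY, sphere_with_handles(p))

def two_boys(p):
    return joined(BOY, one_boy(p))

for p in range(4):
    print(p, one_boy(p), two_boys(p))
```

The values agree with K = 1 − 2p for one Boy surface and K = −2p for two.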

The Boy surface itself has its region of positive curvature equivalent
to a circle, and that of negative curvature like a Möbius strip. Whether
such extreme examples as those given for two-sided surfaces exist is
doubtful, though since the tube connecting the Boy surface with the
sphere with p handles, which may be that of fig. 3 or 4, may be attached
anywhere, it is clear that one-sided surfaces exist in three space of every
genus having as part of their regions of positive (or negative) curvature
a region equivalent to a circle, and bounded everywhere by points
where the curvature has the opposite sign.

* Cf. Blaschke, Differentialgeometrie, vol. I, 1930, p. 165.

* The surface is described and illustrated in Hilbert and Cohn-Vossen,
Anschauliche Geometrie, 1932, p. 280. They state on p. 285 that it has not yet been
investigated whether such smooth representations of other one-sided surfaces
exist, apparently overlooking the simple construction here given.

* Hilbert and Cohn-Vossen, l. c. p. 271.


EFFECT  OF  SURFACE  DISCONTINUITY  ON  THE 
DISTRIBUTION  OF  POTENTIAL 

By H. B. Phillips

1. Statement of Problem. Suppose the region inside a finite closed
surface S is occupied by a homogeneous dielectric of constant $\varepsilon_1$ and that
outside by one of constant $\varepsilon_2$. The distribution of potential on both
sides is modified by the presence of such a surface, and it is the object
of this paper to exhibit this effect by means of integrals extending over
the surface. We shall consider two cases, namely,

1)  the  field  associated  with  given  charges,  and 

2)  the  field  in  which  the  potential  is  assigned  on  a  given  surface. 

Similar formulas are obtained if instead of the dielectrics we have two
media of different conductivities through which steady currents are
flowing. But for definiteness we shall use only the language of
electrostatics.

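2. Potential Due to Given Charges. We use small letters p, q, r
to represent points on S and capitals to represent points not on S.
Let φ(P) be the potential at P due to a certain distribution of charges
in free space, and ψ(P) the potential due to the same charges in the
dielectrics. We shall determine the relation between the functions
φ(P) and ψ(P). In some of these discussions we use the notation
$\psi_1(P)$ to represent the value of ψ(P) at a point P inside S and $\psi_2(P)$ to
represent its value at a point P outside.

If ρ is the density of charge,

$$\nabla^2\varphi = -4\pi\rho, \qquad \nabla^2\psi = -\frac{4\pi\rho}{\varepsilon}.$$

Except for points on S the function

$$\psi(P) - \frac{1}{\varepsilon}\varphi(P)$$

is thus harmonic everywhere. At a point inside S Green's formula*
therefore gives

$$\psi_1(P) - \frac{\varphi(P)}{\varepsilon_1} = \frac{1}{4\pi}\int_S\left[\frac{1}{r}\frac{\partial}{\partial n}\left(\psi_1 - \frac{\varphi}{\varepsilon_1}\right) - \left(\psi_1 - \frac{\varphi}{\varepsilon_1}\right)\frac{\partial}{\partial n}\frac{1}{r}\right]dq, \tag{1}$$

* Kellogg, Potential Theory, page 223.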




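where r is the distance between P and the element of area dq at the
point q on S. Since

$$\psi_2(P) - \frac{\varphi(P)}{\varepsilon_2}$$

is harmonic outside S,

$$0 = \frac{1}{4\pi}\int_S\left[\frac{1}{r}\frac{\partial}{\partial n}\left(\psi_2 - \frac{\varphi}{\varepsilon_2}\right) - \left(\psi_2 - \frac{\varphi}{\varepsilon_2}\right)\frac{\partial}{\partial n}\frac{1}{r}\right]dq, \tag{2}$$

r being the distance from a point P inside. On the surface

$$\varepsilon_1\frac{\partial\psi_1}{\partial n} = \varepsilon_2\frac{\partial\psi_2}{\partial n}, \tag{3}$$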


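and φ is assumed to be continuous and to have continuous first derivatives.
Multiplying (1) by $\varepsilon_1$, (2) by $\varepsilon_2$, subtracting and using (3), we
obtain

$$\varepsilon_1\psi_1(P) = \varphi(P) + \frac{\varepsilon_2 - \varepsilon_1}{4\pi}\int_S\psi\,\frac{\partial}{\partial n}\frac{1}{r}\,dq. \tag{4}$$

In a similar way for a point P outside we obtain

$$\varepsilon_2\psi_2(P) = \varphi(P) + \frac{\varepsilon_2 - \varepsilon_1}{4\pi}\int_S\psi\,\frac{\partial}{\partial n}\frac{1}{r}\,dq. \tag{5}$$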

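When P approaches a point p on the surface, (4) and (5) give as limits

$$\varepsilon_1\psi(p) = \varphi(p) + \frac{\varepsilon_2 - \varepsilon_1}{4\pi}\int_S\psi\,\frac{\partial}{\partial n}\frac{1}{r}\,dq + \frac{\varepsilon_1 - \varepsilon_2}{2}\,\psi(p),$$

$$\varepsilon_2\psi(p) = \varphi(p) + \frac{\varepsilon_2 - \varepsilon_1}{4\pi}\int_S\psi\,\frac{\partial}{\partial n}\frac{1}{r}\,dq + \frac{\varepsilon_2 - \varepsilon_1}{2}\,\psi(p),$$

either of which is equivalent to

$$\psi(p) = \frac{2\varphi(p)}{\varepsilon_2 + \varepsilon_1} + \frac{\varepsilon_2 - \varepsilon_1}{2\pi(\varepsilon_2 + \varepsilon_1)}\int_S\psi\,\frac{\partial}{\partial n}\frac{1}{r}\,dq.$$

If we write

$$\lambda = \frac{\varepsilon_2 - \varepsilon_1}{\varepsilon_2 + \varepsilon_1}, \qquad K(p, q) = \frac{1}{2\pi}\frac{\partial}{\partial n}\frac{1}{r}, \qquad f(p) = \frac{2\varphi(p)}{\varepsilon_2 + \varepsilon_1}, \tag{6}$$

the last equation becomes

$$\psi(p) = f(p) + \lambda\int_S\psi(q)\,K(p, q)\,dq. \tag{7}$$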




Equation (7) is the integral equation ordinarily used in the solution
of the Dirichlet problem. For it the characteristic values of λ are real
and in numerical value equal to or greater than 1.* If $\varepsilon_1$ and $\varepsilon_2$ are
positive, the value of λ determined by (6) is numerically less than 1.
Hence (7) has a unique solution,

$$\psi(p) = f(p) + \lambda\int_S f(q)\,R(p, q, \lambda)\,dq, \tag{8}$$

where

$$R(p, q, \lambda) = K(p, q) + \lambda K_1(p, q) + \lambda^2 K_2(p, q) + \cdots \tag{9}$$

is the resolvent.*
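
The convergence of the series for the resolvent when λ is numerically less than 1 can be seen on a small discrete model. The sketch below is only an illustration under stated assumptions: the surface is replaced by three points, the kernel by a hypothetical 3 × 3 matrix whose rows sum to −1, λ is taken as $(\varepsilon_2 - \varepsilon_1)/(\varepsilon_2 + \varepsilon_1)$, and the function names are invented for the purpose.

```python
def mat_vec(K, v):
    """Discrete analogue of the integral of K(p, q) v(q) over the surface."""
    return [sum(K[i][j] * v[j] for j in range(len(v))) for i in range(len(K))]

def solve_by_iteration(K, f, lam, steps=200):
    """Successive approximations psi <- f + lam * K psi."""
    psi = list(f)
    for _ in range(steps):
        k_psi = mat_vec(K, psi)
        psi = [f[i] + lam * k_psi[i] for i in range(len(f))]
    return psi

def solve_by_resolvent(K, f, lam, terms=200):
    """psi = f + lam * (K + lam K_1 + lam^2 K_2 + ...) f, summed term by term."""
    psi = list(f)
    term = list(f)
    for _ in range(terms):
        term = [lam * t for t in mat_vec(K, term)]   # next term (lam K)^n f
        psi = [psi[i] + term[i] for i in range(len(f))]
    return psi

eps1, eps2 = 2.0, 5.0                       # assumed dielectric constants
lam = (eps2 - eps1) / (eps2 + eps1)         # numerically less than 1
K = [[-0.5, -0.3, -0.2],                    # hypothetical discretised kernel
     [-0.3, -0.4, -0.3],
     [-0.2, -0.3, -0.5]]
f = [1.0, 2.0, 0.5]

psi_iter = solve_by_iteration(K, f, lam)
psi_series = solve_by_resolvent(K, f, lam)
print(psi_iter)
print(psi_series)
```

Both routes give the same values, and substituting them back into the discrete form of (7) leaves no residual.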

For a point P not on S we write

$$K(P, q) = \frac{1}{2\pi}\frac{\partial}{\partial n}\frac{1}{r}, \tag{10}$$

where r is the distance between P and the point q at which the normal
derivative is taken. Similarly, we write

$$K_1(P, q) = \int_S K(P, r)\,K(r, q)\,dr,$$

$$K_n(P, q) = \int_S K(P, r)\,K_{n-1}(r, q)\,dr,$$

$$R(P, q, \lambda) = K(P, q) + \lambda K_1(P, q) + \lambda^2 K_2(P, q) + \cdots, \tag{11}$$

whence

$$R(P, q, \lambda) = K(P, q) + \lambda\int_S K(P, r)\,R(r, q, \lambda)\,dr, \tag{12}$$

from which the convergence of (11) follows.

Multiplying both sides of (8) by K(P, p) dp and integrating, we
obtain

$$\int_S\psi(p)\,K(P, p)\,dp = \int_S f(p)\,K(P, p)\,dp + \lambda\iint f(q)\,K(P, p)\,R(p, q, \lambda)\,dp\,dq.$$

* Kellogg, loc. cit., pages 309, 310.

* Kellogg, loc. cit., page 280.




Replacing the mute symbol p by q on the left and simplifying the right
side by use of (12), this becomes

$$\int_S\psi(q)\,K(P, q)\,dq = \int_S f(q)\,R(P, q, \lambda)\,dq. \tag{13}$$

From (4) and (6) we thus have

$$\varepsilon_1\psi_1(P) = \varphi(P) + \lambda\int_S\varphi(q)\,R(P, q, \lambda)\,dq.$$

Similarly, if P lies outside S,

$$\varepsilon_2\psi_2(P) = \varphi(P) + \lambda\int_S\varphi(q)\,R(P, q, \lambda)\,dq.$$

The last two equations are equivalent to the single relation

$$\varepsilon\psi(P) = \varphi(P) + \lambda\int_S\varphi(q)\,R(P, q, \lambda)\,dq, \tag{14}$$

where ε is the dielectric constant at P. This is the potential at P due
to the system of charges which in free space would determine the
potential φ(P). The value

$$\frac{\varphi(P)}{\varepsilon}$$

obtained by taking only the first term on the right of (14), is the
potential in a homogeneous dielectric of constant ε. The second term may be
regarded as an added effect due to reflection and refraction in the
surface of discontinuity.
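
Relation (14) can be checked on a case with a known solution: a charge e at the centre of a dielectric sphere of radius a. The sketch below rests on several assumptions that are not stated in the text: the normal is taken outward from S, the kernel is $K = \frac{1}{2\pi}\frac{\partial}{\partial n}\frac{1}{r}$, and $\lambda = (\varepsilon_2 - \varepsilon_1)/(\varepsilon_2 + \varepsilon_1)$. With these, Gauss's integral gives $\int_S K\,dq$ equal to −2, −1 or 0 according as the point is inside, on, or outside S, so that the integral of the resolvent over S is $-2/(1 + \lambda)$ inside and 0 outside. The function names are illustrative.

```python
def exact_potential(e, a, eps1, eps2, dist):
    """Charge e at the centre of a sphere of radius a (eps1 inside, eps2 outside)."""
    if dist >= a:
        return e / (eps2 * dist)
    # Inside: field e/(eps1 r^2), constant chosen to make the potential continuous.
    return e / (eps1 * dist) + (e / a) * (1.0 / eps2 - 1.0 / eps1)

def potential_from_14(e, a, eps1, eps2, dist):
    """Potential from the assumed form of (14) for the same problem."""
    lam = (eps2 - eps1) / (eps2 + eps1)
    phi_P = e / dist                  # free-space potential at P
    phi_S = e / a                     # free-space potential, constant on S
    inside = dist < a
    resolvent_integral = -2.0 / (1.0 + lam) if inside else 0.0
    eps = eps1 if inside else eps2
    return (phi_P + lam * phi_S * resolvent_integral) / eps

e, a, eps1, eps2 = 1.0, 2.0, 2.0, 5.0
for dist in (0.5, 1.0, 1.9, 2.5, 4.0):
    print(dist, exact_potential(e, a, eps1, eps2, dist),
          potential_from_14(e, a, eps1, eps2, dist))
```

Inside the sphere the second term of (14) supplies the constant by which the potential is raised or lowered, and outside it vanishes, as the symmetry of the case requires.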

3. Potential Assigned on a Given Surface. Let S be a closed
surface containing a closed surface $S_0$, the two surfaces having no
points in common. Let φ be a function continuous on $S_0$ and harmonic
outside, $\psi_1$ a function equal to φ on $S_0$ and harmonic between $S_0$ and S,
$\psi_2$ a function harmonic outside S, and on S let

$$\psi_1 = \psi_2, \qquad \varepsilon_1\frac{\partial\psi_1}{\partial n} = \varepsilon_2\frac{\partial\psi_2}{\partial n}. \tag{15}$$

We wish to express $\psi_1$ and $\psi_2$ in terms of φ.

For that purpose let

$$g(P, Q) = \frac{1}{r} + \theta(P, Q) \tag{16}$$




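be the Green's function for the region outside $S_0$. If P is a point
between $S_0$ and S, by Green's formulas we have

$$\psi_1(P) - \varphi(P) = \frac{1}{4\pi}\int_S\left[g(P, q)\frac{\partial}{\partial n}(\psi_1 - \varphi) - (\psi_1 - \varphi)\frac{\partial}{\partial n}g(P, q)\right]dq,$$

the integral over $S_0$ being zero since

$$g(P, Q) = \psi_1(Q) - \varphi(Q) = 0$$

for points Q on $S_0$. Since φ(Q) and g(P, Q) are harmonic at points Q
outside S, the terms containing φ vanish, leaving

$$\psi_1(P) - \varphi(P) = \frac{1}{4\pi}\int_S\left[g(P, q)\frac{\partial\psi_1}{\partial n} - \psi_1\frac{\partial}{\partial n}g(P, q)\right]dq. \tag{17}$$

Since $\psi_2(Q)$ is also harmonic outside S,

$$0 = \frac{1}{4\pi}\int_S\left[g(P, q)\frac{\partial\psi_2}{\partial n} - \psi_2\frac{\partial}{\partial n}g(P, q)\right]dq. \tag{18}$$

Multiplying (17) by $\varepsilon_1$, (18) by $\varepsilon_2$, subtracting, and using (15), we get

$$\varepsilon_1[\psi_1(P) - \varphi(P)] = \frac{\varepsilon_2 - \varepsilon_1}{4\pi}\int_S\psi\,\frac{\partial}{\partial n}g(P, q)\,dq. \tag{19}$$

If P lies outside S the functions g(P, Q), $\psi_1(Q)$, φ(Q) are harmonic at
points Q between $S_0$ and S, and

$$g(P, Q) = \psi_1(Q) - \varphi(Q) = 0$$

at points Q on $S_0$. Hence

$$0 = \frac{1}{4\pi}\int_S\left[(\psi_1 - \varphi)\frac{\partial}{\partial n}g(P, q) - g(P, q)\frac{\partial}{\partial n}(\psi_1 - \varphi)\right]dq. \tag{20}$$

Since $\psi_2(Q)$, φ(Q) are harmonic outside S and

$$g(P, Q) = \frac{1}{r} + \theta(P, Q)$$

differs from $\frac{1}{r}$ by a function harmonic outside,

$$\psi_2(P) = -\frac{1}{4\pi}\int_S\left[g(P, q)\frac{\partial\psi_2}{\partial n} - \psi_2\frac{\partial}{\partial n}g(P, q)\right]dq, \tag{21}$$

$$\varphi(P) = -\frac{1}{4\pi}\int_S\left[g(P, q)\frac{\partial\varphi}{\partial n} - \varphi\frac{\partial}{\partial n}g(P, q)\right]dq, \tag{22}$$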




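the negative signs being due to the fact that the normal is directed into
the region containing P. Multiplying (20) by $\varepsilon_1$, (21) by $-\varepsilon_2$, (22) by
$\varepsilon_1$, adding, and using (15), we get

$$\varepsilon_2\psi_2(P) - \varepsilon_1\varphi(P) = \frac{\varepsilon_2 - \varepsilon_1}{4\pi}\int_S\psi\,\frac{\partial}{\partial n}g(P, q)\,dq. \tag{23}$$

Since θ(P, Q) is harmonic everywhere outside $S_0$, g(P, Q) has the same
singularities as $\frac{1}{r}$ at points of S. When P approaches a point p of S
equations (19) and (23) thus give as limits

$$\varepsilon_1(\psi - \varphi) = \frac{\varepsilon_2 - \varepsilon_1}{4\pi}\int_S\psi\,\frac{\partial}{\partial n}g(p, q)\,dq + \frac{\varepsilon_1 - \varepsilon_2}{2}\,\psi,$$

$$\varepsilon_2\psi - \varepsilon_1\varphi = \frac{\varepsilon_2 - \varepsilon_1}{4\pi}\int_S\psi\,\frac{\partial}{\partial n}g(p, q)\,dq + \frac{\varepsilon_2 - \varepsilon_1}{2}\,\psi,$$

either of which is equivalent to

$$\psi(p) = \frac{2\varepsilon_1}{\varepsilon_2 + \varepsilon_1}\,\varphi(p) + \frac{\varepsilon_2 - \varepsilon_1}{2\pi(\varepsilon_2 + \varepsilon_1)}\int_S\psi\,\frac{\partial}{\partial n}g(p, q)\,dq. \tag{24}$$

If we write

$$\lambda = \frac{\varepsilon_2 - \varepsilon_1}{\varepsilon_2 + \varepsilon_1}, \qquad K(p, q) = \frac{1}{2\pi}\frac{\partial}{\partial n}g(p, q), \qquad f(p) = \frac{2\varepsilon_1}{\varepsilon_2 + \varepsilon_1}\,\varphi(p), \tag{25}$$

this becomes

$$\psi(p) = f(p) + \lambda\int_S\psi(q)\,K(p, q)\,dq. \tag{26}$$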

By a discussion practically identical with that used in potential theory
(cf. Kellogg, pages 309, 310) it can be shown that the characteristic
values of $\lambda$ in (26) are real and in numerical value not less than unity.
If $\epsilon_1$ and $\epsilon_2$ are positive, the value of $\lambda$ determined by (25) is less than 1.
Therefore (26) has a unique solution,

$$\phi(p) = f(p) + \lambda \int_S f(q)\, R(p, q, \lambda)\, dq , \qquad (27)$$


where 


$$R(p, q, \lambda) = K(p, q) + \lambda K_1(p, q) + \lambda^2 K_2(p, q) + \cdots \qquad (28)$$

is  the  resolvent. 
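The resolvent series can be tried out numerically. The following is a minimal sketch, not taken from the paper: it assumes a toy kernel $K(p, q) = pq$ on the interval $[0, 1]$ (chosen only because it has a closed-form solution), a midpoint-rule discretisation, and a hypothetical value $\lambda = 0.5$ well inside the radius of convergence. The function names `resolvent` and `solve` are illustrative.

```python
# Illustrative sketch: truncated Neumann (resolvent) series for
#   phi(p) = f(p) + lam * integral_0^1 K(p, q) phi(q) dq
# discretised with the midpoint rule. The kernel and lam are assumptions.

def resolvent(K, w, lam, terms=40):
    """Truncated series R = K + lam*K_1 + lam^2*K_2 + ... on the grid."""
    n = len(K)
    R = [row[:] for row in K]
    Kn = [row[:] for row in K]          # current iterated kernel K_n
    power = 1.0
    for _ in range(terms - 1):
        # K_n(p, q) = sum_r K(p, r) * w_r * K_{n-1}(r, q)
        Kn = [[sum(K[i][r] * w[r] * Kn[r][j] for r in range(n))
               for j in range(n)] for i in range(n)]
        power *= lam
        for i in range(n):
            for j in range(n):
                R[i][j] += power * Kn[i][j]
    return R


def solve(K, w, f, lam, terms=40):
    """phi(p) = f(p) + lam * sum_q R(p, q) * f(q) * w_q."""
    R = resolvent(K, w, lam, terms)
    n = len(f)
    return [f[i] + lam * sum(R[i][j] * w[j] * f[j] for j in range(n))
            for i in range(n)]


n = 20
h = 1.0 / n
pts = [(i + 0.5) * h for i in range(n)]   # midpoint nodes
w = [h] * n                               # quadrature weights
K = [[p * q for q in pts] for p in pts]   # toy kernel K(p, q) = p*q
f = pts[:]                                # f(p) = p
lam = 0.5

phi = solve(K, w, f, lam)

# Largest residual of the discretised integral equation
residual = max(
    abs(phi[i] - f[i] - lam * sum(K[i][j] * w[j] * phi[j] for j in range(n)))
    for i in range(n)
)
```

For this separable kernel the discrete solution is $p / (1 - \lambda S)$ with $S = \sum_q w_q q^2$, so the series result can be compared against it directly; a kernel with characteristic values closer to $\lambda$ would need more terms.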

In  case  of  a  point  P  not  on  S  we  write 


(29) 




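Similarly,

$$K_1(P, q) = \int_S K(P, r)\, K(r, q)\, dr ,$$

$$K_n(P, q) = \int_S K(P, r)\, K_{n-1}(r, q)\, dr ,$$

$$R(P, q, \lambda) = K(P, q) + \lambda K_1(P, q) + \lambda^2 K_2(P, q) + \cdots$$

whence

$$R(P, q, \lambda) = K(P, q) + \lambda \int_S K(P, r)\, R(r, q, \lambda)\, dr . \qquad (30)$$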

Multiplying both sides of (27) by $K(P, p)\, dp$, integrating, and using
(30), we obtain

$$\int_S \phi(q)\, K(P, q)\, dq = \int_S f(q)\, R(P, q, \lambda)\, dq . \qquad (31)$$

From  (19)  and  (25)  we  thus  have 

«.  k.(P)  -  HP)]  -  ^  f(q)  RiP,  q,  X)  dq . 

Replacing $f(q)$ by its value from (25), this becomes

«i^i(P)  “  *1  [HP)  +  X  /a  Hq)  RiP,  q,  x)  dq] . 

This  determines  the  potential  at  a  point  P  between  So  and  S.  In  a 
similar  way  from  (23)  we  obtain 

«t  MP)  «  «i  [HP)  +  X  /a  Hq)  RiP,  q,  x)  d?) 

as  the  potential  at  a  point  P  outside.  The  last  two  equations  are 
equivalent  to  the  single  equation 

.^(P)  -  (^(P)  +  X  /a  Hq)  RiP,  q,  X)  dq] ,  (32) 

where  c  is  the  dielectric  constant  at  P.  The  function  ^  is  harmonic  in 
the  region  outside  So  and  ^  is  the  potential  in  the  dielectrics  which  is 
equal  to  ^  on  So. 


A MODIFICATION OF LEVI-CIVITA'S WAVE EQUATION

By BANESH HOFFMANN*

§1. Introduction. In a recent paper* Levi-Civita has criticised the
use of auxiliary ennuples of vectors in the setting up of a generally
invariant form of the Dirac wave equation and has been led to propose a
new wave equation in which the use of such ennuples is entirely avoided.
The significant point of this work is that it employs a wave vector
instead of the half-vector of Dirac's theory or the scalar of Schrödinger's
theory, and achieves the introduction of electron spin by the arbitrary
addition to the relativistic Schrödinger wave equation (for a vector) of
terms involving the electromagnetic six-vector which “couple” the
components of the wave vector.

Now in the ordinary relativity theory the contracted second covariant
derivative of a wave vector involves terms coupling the components
of $\psi$, but these terms vanish for Galilean coordinates in a space free
from gravitation; Klein* has shown that the relativistic Schrödinger
equation can be expressed very concisely in terms of the contracted
second covariant derivative of a scalar wave function in a Kaluza-Klein
five-dimensional space. The Kaluza-Klein theory has since been shown
to be a projective four-dimensional theory and as such has received many
forms at the hands of various workers.* From the nature of the projective
theory one would expect it to take care of the electromagnetic
part of the field in an automatic manner; if we change from a projective
scalar wave function to a projective vector, the contracted second projective
derivative will involve several terms that do not appear in the
scalar wave equation; these terms will couple the components of the
wave vector and will involve the components of the projective connection;
in the affine case considered by Levi-Civita the components of
the affine connection vanish for Galilean coordinates in a space free from
gravitation, but the components of the projective connection involve the

* Department of Mathematics, University of Rochester, N. Y.

* Levi-Civita, Sitz. d. Preuss. Akad. d. Wiss., V, 240 (1933).

* O. Klein, Zeits. f. Physik 46, 188 (1927).

* In this paper I shall use the generic title of projective relativity to denote all
aspects of the theory connected with the names Kaluza-Klein, Veblen-Hoffmann,
Einstein-Mayer and Schouten-van Dantzig.



electromagnetic  potentials  as  well  as  the  gravitational  potentials  and 
therefore  will  not  necessarily  vanish  in  the  Galilean  case;  consequently 
there  will  remain  coupling  terms  involving  the  electromagnetic  part  of 
the  field. 

This  fact  suggests  that  a  suitable  projective  vector  wave  equation 
might  automatically  contain  coupling  terms  that  would  serve  the  pur¬ 
pose  of  the  coupling  terms  introduced  by  Levi-Civita,  and  in  this 
paper  we  show  that  a  projective  vector  equation  can  be  constructed  for 
which  this  is  indeed  the  case. 

§2. Notation. We shall employ a notation based on two previous
papers on the projective theory.* We shall use $\gamma$ for the projective
metric as in I and shall employ the symmetric connection $\Gamma$ of that
paper. In II we used $g$ for the projective metric but in the present
paper this letter is reserved for the $g$ used in I.

As usual Greek suffixes run from 0 to 4 and Latin suffixes from 1 to 4,
with the suffix 4 denoting the time-like coordinate; following the usage
in Veblen's recent book* we shall no longer refer to $x^0$ as the factor but
as the gauge variable.

We  shall  have  occasion  to  use  the  reference  system  x*  of  II  and  shall 
adhere  to  the  notation  of  that  paper  so  far  as  suffixes  are  concerned;  the 
null  suffix  of  the  X-system  will  thus  be  denoted  by  6. 

The  components  of  the  electromagnetic  six-vector  will  be  denoted  by 
the  of  I. 

§3.  The  Wave  Equation.  We  take  a  projective  vector  of  index 
N  and  such  that  ■■  0;  this  is  an  invariantive  restriction  upon  i(/y 
since  is  a  projective  scalar. 

The  quantity  where  the  semi-colon  denotes  projective 

derivation  according  to  I,  has  five  components,  and  therefore  we  do  not 
equate  it  directly  to  zero  to  give  the  wave  equation;  it  must  first  be 
modified  so  that  we  shall  obtain  only  four  equations  for  the  four  non- 
vanishing  components  of  this  can  be  effected  in  an  invariantive  man¬ 
ner  quite  simply  by  writing  the  wave  equation  as 

-■  0 ,  (1) 

where  the  notation  [y*^^,,ms],-«  is  used  instead  of  y"^^-,m$  since  the 
latter  might  well  be  confused  with  the  contracted  second  projective 

* O. Veblen and B. Hoffmann, Phys. Rev. 36, 810 (1930), hereinafter referred
to as I,

B. Hoffmann, Phys. Rev. 43, 616 (1933), hereinafter referred to as II.

* “Projektive Relativitätstheorie,” Ergebnisse der Math., 2, Berlin, 1933.
See page 10.




derivative of the projective scalar which in this case would, of course,
be zero.

The equation (1) is identically satisfied when $\gamma = 0$ and therefore
yields only four equations for the four $\psi$'s.

Now  any  projective  vector  equation  P.  ~  0  can  be  broken  up  into  the 
affine  equations  P*  ~  0,  Pe  «  0,  and  for  the  present  equation,  therefore, 
since  m  the  affine  equivalents  reduce  merely  to 

[y-V’:  J,-.  -  0,  (2) 

where  c  takes  on  the  values  1  to  4,  the  covariant  null  equation  being 
identically  satisfied,  as  we  have  already  seen. 

§4. Explicit Form of the Wave Equation. We must now find a
more  explicit  form  for  the  above  wave  equation  in  which  the  electro¬ 
magnetic  part  is  brought  into  evidence.  The  computation  is  greatly 
facilitated  if  we  go  over  to  a  new  gauge  variable  by  means  of  the  non- 
holonomic  gauge  transformation  given  symbolically  by 

x'  —  1 

[  (3) 

X*  -  I*  -I-  J 

this is the gauge transformation used in II to change from the formalism
of I to the formalism of the Einstein-Mayer theory. We thus* have
in the new reference system

Qmn  t  "tml  $  1 

j  ■■  5*^  J 

and 

r:,  -  L‘J  ,  '  ri,  -  ri,  -  -  ri.  -  -  (6) 

all  other  components  of  F  in  this  system  being  zero. 

In  the  X-system  ^  has  the  form 

ix  -  (6) 

and  further  we  find  that 

“  0.  (7) 

* Cf. I, bearing in mind the differences in notation. The reason for the minus
signs in (6) is explained in II §5, below Eq. (27). Note that Eq. (26) of II is a
misprint, a minus sign having been omitted in the value of $F_{mn}$.


7*»  “ 

ymn  ^ 




In the X-system the wave equation in the form (2) becomes

-  0,  (8) 

with  n  taking  on  the  values  1  to  4. 

Using  (5)  and  (7)  it  is  at  once  found  that  has  the  components 

mm  ^  ^  (9) 

where the comma denotes affine covariant derivation with respect to the
g'%,  and  we  are  temporarily  omitting  the  somewhat  pedantic  notation 
used  in  (2)  and  similar  equations. 

We  now  have 

and  by  means  of  (4),  (5),  (7)  and  (9)  it  is  easily  found  that 

(7'“^':  -I-  A^V)  -  +  2^,  V.'^‘.  (10) 

The wave equation may therefore be written in terms of the x-system,
bearing in mind the form of the gauge factor in (6), as

■  [ll**(d«  —  AV.)(d6  —  Nipi)  +  AT*) 

(11) 

-  2Ar^.  *^  +  2v>»  V.  *^*  -  0 

where  for  convenience  we  have  written  d  for  the  covariant  derivative 
previously  denoted  by  a  comma. 

§5. Discussion of the Wave Equation. In order to get the correct
constants in the part

-  (ir*(d.  -  N^m){d,  -  N^)  +  AT*)  r  (12) 

of  the  wave  equation  we  must  have 

$$N\phi_m = i(e/hc)\,V_m , \qquad N^2 = (m^2c^2/h^2) ,$$

i.e.

$$N = (mc/h), \qquad \phi_m = i(e/mc^2)\,V_m , \qquad (13)$$

where $V_m$ is the electromagnetic potential ($V_1$, $V_2$, $V_3$ in electromagnetic
and $V_4$ in electrostatic units).

This  fixes  all  the  constants  and  we  may  not  alter  their  values  in  an 
endeavour  to  make  the  extra  terms  give  the  results  we  want;  it  turns 
out  that  no  such  alteration  is  required. 




Remembering  that  the  are  defined  with  a  factor  of  one-half,*  we 
see  that  the  extra  terms  in  the  wave  equation  become 

-H  i{e/hc)  -  (e*/2m*c*)  F*  •  ‘  ^  (14) 

(3  V'  d 

— r—  )  of  the 
or  ox*  / 

electromagnetic  six- vector. 

Now $i(e/hc)$ is of the order of $10^{7}$ whilst $(e^2/2m^2c^4)$ is approximately
$10^{-7}$ so that so far as the external part of the field is concerned the
second  term  in  (14)  is  negligible  compared  with  the  first  for  external 
fields  such  as  are  met  with  in  practice.  The  total  field,  however,  in¬ 
cludes  the  field  due  to  the  nucleus  and  this  becomes  very  large  as  r,  the 
distance  from  the  centre  of  the  nucleus,  becomes  small;  if,  for  example, 
we  take  a  Coulomb  field  as  descriptive  of  the  effect  of  the  nucleus,  the 
two  terms  in  (14)  become  of  the  same  magnitude  when  r  is  about  10~'* 
cm. It has often been suggested that the Coulomb law is only an approximation
and that the true law must avoid the infinity at $r = 0$,
and  since  the  first  term  in  (14)  is  itself  merely  of  the  magnitude  of  the 
relativity  correction  term  it  appears  that  with  a  suitably  modified  law 
of  force  the  second  term  of  (14),  which,  as  r  increases,  falls  off  relative 
to  the  first  as  the  inverse  square  of  r,  would  contribute  a  negligible  cor¬ 
rection  to  the  energy  levels. 
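The orders of magnitude quoted in this paragraph can be checked with a few lines of arithmetic. The sketch below is only a rough illustration under stated assumptions: Gaussian (CGS) values of the constants, the symbol $h$ read as the reduced Planck constant, a Coulomb field $F = e/r^2$ of a single unit charge, and no attempt to reproduce the paper's mixed electromagnetic and electrostatic units. The variable names are illustrative.

```python
import math

# Gaussian (CGS) constants; assumed for this illustration, not from the paper
e = 4.80320e-10       # electron charge, esu
hbar = 1.054572e-27   # reduced Planck constant, erg s (assumed meaning of "h")
c = 2.997925e10       # speed of light, cm/s
m = 9.109384e-28      # electron mass, g

first_coeff = e / (hbar * c)               # coefficient of the term linear in F
second_coeff = e ** 2 / (2 * m ** 2 * c ** 4)  # coefficient of the term quadratic in F

# For a field strength F the two terms balance when
#   first_coeff * F == second_coeff * F**2,  i.e.  F = first_coeff / second_coeff.
F_equal = first_coeff / second_coeff

# Coulomb field of one unit charge, F = e / r^2, solved for r (in cm)
r_equal = math.sqrt(e / F_equal)

print(f"e/(hbar c)      = {first_coeff:.2e}")
print(f"e^2/(2 m^2 c^4) = {second_coeff:.2e}")
print(f"crossover r     = {r_equal:.2e} cm")
```

Under these assumptions the two coefficients come out of order $10^{7}$ and $10^{-7}$, and the crossover radius is of order $10^{-12}$ cm; a different reading of $h$ or of the units would shift these figures by modest factors.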

With  such  an  assumption  we  may  omit  the  second  term  of  (14)  for  our 
present  purposes  and  write  the  wave  equation  in  the  approximate  form 

V  -I-  He /he)  ^  ^  -  0 .  (15) 

When gravitation is absent and we take coordinates such that

$$g_{mn} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \qquad (16)$$

the operator $S$ becomes just what the second order relativistic operator
of Schrödinger becomes for this case, and it does not couple the $\psi$'s; the
coupling is now performed only by the second term in (15).

Following Levi-Civita we consider the two special cases of a purely


* Cf. I, top of page 814.




magnetic and a purely electrostatic field in some Galilean coordinate
system:

I. We take $F_{12} = -F_{21} = H$ as the only non-vanishing components
of $F_{ik}$; then the wave equation may be satisfied by taking $\psi^3$ and $\psi^4$
equal to zero and

$$S\psi^1 + i(e/hc)\,H\psi^2 = 0 , \qquad S\psi^2 - i(e/hc)\,H\psi^1 = 0 , \qquad (17)$$

which may be written as the matrix equation

$$[\,S S_0 + (e/hc)\,H S_1\,] \begin{pmatrix} \psi^1 \\ \psi^2 \end{pmatrix} = 0 , \qquad (18)$$

where

$$S_0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} , \quad \text{and} \quad S_1 = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix} , \qquad (19)$$

and since $S_1$ is Hermitean this gives a real interaction between the spin
and the magnetic field, and moreover this interaction is of the correct
magnitude.
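The Hermitean property invoked here is easy to verify directly. The sketch below is an illustration only: it takes the 2x2 matrix with rows $(0, i)$ and $(-i, 0)$ as the spin matrix of case I, and the helper names `dagger`, `eigenvalues`, and `apply` are hypothetical. It checks that the matrix equals its conjugate transpose, that its eigenvalues are real, and how it mixes the two wave-vector components.

```python
import cmath

# Spin matrix of case I, assumed to have rows (0, i) and (-i, 0)
S1 = [[0, 1j], [-1j, 0]]


def dagger(M):
    """Conjugate transpose of a 2x2 matrix given as nested lists."""
    return [[complex(M[j][i]).conjugate() for j in range(2)] for i in range(2)]


def eigenvalues(M):
    """Roots of lambda^2 - tr(M)*lambda + det(M) = 0 for a 2x2 matrix."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2


def apply(M, v):
    """Matrix-vector product for a 2x2 matrix and a 2-component vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]


is_hermitian = dagger(S1) == [[complex(x) for x in row] for row in S1]
lam_plus, lam_minus = eigenvalues(S1)

# Acting on (psi1, psi2) = (1, 0) gives (0, -i): the first component feeds
# the second equation with a factor -i, the pattern of the coupled pair.
coupled = apply(S1, [1.0, 0.0])
```

Because the eigenvalues are real (plus and minus one), a real coefficient multiplying this matrix yields a real energy shift, which is the sense in which the magnetic interaction is called real.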

II.  We  take  Fu  »  —Fi{  ^  E  sa  the  only  non-vanishing  components 
of  Fa*;  then  the  wave  equation  may  be  satisfied  by  taking  ^  and 
equal to zero and


+  He /he)  Ei*  ~  O] 
-I-  i(e/hc)  -  0  J 


which  may  be  written  as 


ISSo  -I-  i(e/hc)  ESt] 


0 


(20) 


(21) 


where 


S,  -> 


(22) 


and since $S_2$ is Hermitean this gives an imaginary interaction between the
spin and the field.

§6. Remarks. We have constructed a wave equation similar in
content  to  that  of  Levi-Civita  but  falling  within  the  scheme  of  the 
projective  theory  of  relativity;  a  term  corresponding  to  the  arbitrary 




term introduced by Levi-Civita has appeared automatically and with
the correct coefficient, and the presence of the square root of minus one
in this coupling term as obtained in the present paper has obviated the
use of the $\epsilon$ tensor that was necessary in Levi-Civita's work in the
absence of the $i$. From the point of view of any general unified field
theory the absence of the $\epsilon$ is probably an advantage, but it should be
remembered that the wave equation we have here developed does not
yet form a natural part of any larger group of field equations, and its
connection with the field equations of projective relativity as given
hitherto lies only in the fact that it employs the same building material
and with the same significance. So far there does not seem to have
been introduced into the fundamental formalism of the projective theory
a vector that might play the part of the wave vector $\psi_\gamma$ of this paper.


A  GENERAL  THEORY  OF  ELECTRIC  WAVE  FILTERS 

By  H.  W.  Bode* 

ABSTRACT 

The development of electric wave filters has shown that networks of
many configurations may have filtering properties. Filters may be
built as ladder structures, as lattices, as bridged-T's, or in a variety of
combinations of transformers and ordinary elements. No general
theory uniting all these configurations has, however, been developed.
Each structure is analyzed by a method adapted primarily for that
configuration alone. The result has been that the fundamental dynamical
conditions for filtering action have remained undiscovered, the
relations between filters of different configurations have been left at
least somewhat obscure, and there has been no indication of the results
which might be obtained from filters of other configurations.

The first half of this paper develops a general theory which will answer
these questions for any filter transmitting a single frequency band. It
is assumed that the internal configuration of the filter may be any
arrangement of ordinary positive electrical elements, including transformers.
The theory is based upon a combination of the ordinary
image parameter method of analyzing networks and the normal coordinate
method familiar in the dynamics of vibrating systems. It is
found that the conditions for filtering action can be expressed by means
of relations between the normal coordinates of the network corresponding
to various systems of external constraints. The same normal
coordinate solutions also furnish convenient parameters in terms of
which general expressions for the external characteristics of filters are
built up.

The  second  half  of  the  paper  develops  a  method  of  finding  a  definite 
network  corresponding  to  any  filter  characteristics  which  the  analysis 
of  the  first  half  shows  to  be  physically  possible.  The  method  followed 
is a generalization of Zobel's composite filter analysis of ladder networks.
As  in  Zobel’s  scheme,  the  complete  filter  is  composed  of  a  number  of 
simpler  structures  in  tandem,  each  constituent  being  obtained  by  trans¬ 
formations  or  derivations  of  elementary  prototype  sections.  In  order 

* Bell Telephone Laboratories, Inc.




to  obtain  all  physically  possible  characteristics,  however,  it  has  been 
necessary  to  add  two  new  section  derivations  to  the  list  described  by 
Zobel. One, a complex m-derivation, is an extension of the real
m-derivations of ordinary theories. The other, an h-derivation, is in a
sense the converse of the m-derivation, since it changes the image
impedance without affecting the transfer constant.

Aside  from  its  purely  theoretical  interest,  the  analysis  leads  to  two 
results  of  immediate  practical  value.  The  first  is  an  increase  in  our 
ability  to  convert  filter  designs  from  their  original  configurations  to 
others  which  may  be  more  convenient  for  purposes  of  physical  con¬ 
struction.  The  second  is  the  introduction  of  new  phase,  impedance  and 
attenuation characteristics by the complex m- and h-derived sections.
The  novel  phase  and  impedance  characteristics  are  particularly  impor¬ 
tant  in  improving  filter  performance.  Some  of  the  results  to  which  they 
lead  are  described  in  a  forthcoming  paper.* 

INTRODUCTION 

The  propagation  of  waves  in  uniform  media  is  one  of  the  most  familiar 
phenomena  in  physics.  Examples  are  found  in  the  transmission  of 
electromagnetic  waves  in  space  or  along  wires;  of  acoustic  waves  in  air 
or  other  substances;  in  the  vibrations  of  a  taut  uniform  string;  and  in 
the  longitudinal  vibrations  of  a  metallic  bar.  For  uni-dimensional 
propagation,  at  least,  the  form  of  the  mathematical  analysis  in  all  of 
these  cases  turns  out  to  be  very  much  the  same,  the  differences  being 
chiefly  ones  of  terminology.  We  find  in  each  case  that  the  propagation 
of  the  wave  can  be  specified  by  only  two  quantities.  One,  which  can  be 
called  the  characteristic  impedance,  represents  the  ratio  of  the  applied 
force  to  the  resulting  disturbance.  The  second,  which  can  be  called 
the  propagation  constant  or  exponent,  represents  the  attenuation,  or 
reduction in amplitude, and phase velocity of the wave as it travels down
a unit length of the medium.

Since wave filters are constituent parts of communication systems,
which  themselves  may  be  regarded  as  media  for  the  propagation  of 
waves,  the  theory  just  outlined  is  the  natural  foundation  upon  which 
wave  filter  theory  is  built.  In  several  respects,  however,  filter  theory 
and  this  classical  propagation  theory  are  diametrically  opposed.  The 
differences  are  particularly  important  for  the  purposes  of  this  paper 
since  they  give  rise  to  the  distinctive  problems  of  filter  theory. 

* “Ideal Filters” by R. L. Dietzold and H. W. Bode, Bell System Technical
Journal, Jan. 1935.




Perhaps  the  primary  distinction  between  the  classical  theory  and 
filter  theory  is  merely  one  of  point  of  view.  In  the  classical  theory,  the 
central  conception  is  that  of  non-dispersive  wave  propagation.  In  other 
words,  in  this  theory  an  ideal  medium  is  one  along  which  a  disturbance 
of  any  sort  travels  with  undistorted  wave  form.  We  know,  of  course, 
that  this  ideal  is  not  always  achieved,  even  for  the  media  cited  in  the 
opening  paragraph.  The  wave  form  of  an  arbitrary  signal,  such  as  a 
telegraph  dot,  for  example,  may  change  radically  in  transmission  down  a 
long  cable.  The  dispersion  and  selective  absorption  exhibited  by  glass 
and  similar  media  furnish  corresponding  illustrations  in  the  optical 
field.  In  the  classical  theory,  however,  the  fact  that  the  wave  form  is  at 
least  approximately  preserved  in  many  instances,  is  the  phenomenon  of 
greatest  interest.  The  irregularities  manifested  by  actual  media  are 
regarded  as  annoying  lapses  from  the  perfect  simplicity  of  the  picture. 
They  cannot  be  ignored,  of  course,  and  they  may  even  be  of  some 
interest  on  their  own  account,  but  at  least  they  do  not  constitute  the 
principal  objective  of  study. 

On  the  other  hand,  in  network  theory,  of  which  filter  theory  is  an 
important  part,  these  “imperfections”  are  our  primary  concern. 
Nothing  could  interest  us  less  than  a  medium  which  made  no  change  in 
the  nature  of  the  signal.*  Instead,  our  objective  is  to  control  distortion 
and  make  it  serve  useful  ends.  Equalizers,  for  example,  are  used  when 
it  is  necessary  to  introduce  distortion  in  order  to  compensate  for  the 
distortion  unavoidably  produced  by  other  apparatus  in  the  system. 
Filters  are  used  when  it  is  necessary  to  eliminate  certain  frequency 
ranges  in  order  to  separate  desired  from  undesired  signals  traveling 
down  the  same  pair  of  wires,  to  prevent  interference  from  noise,  or 
perhaps  to  economize  frequency  space  in  carrier  systems.  The  filter  is 
not  intended  to  introduce  distortion  in  the  frequency  ranges  where  the 
desired  signal  lies.  In  so  far  as  the  undesired  portions  of  the  frequency 
spectrum  are  suppressed,  however,  the  wave  which  appears  at  its  output 
fails  to  resemble  that  at  its  input  and  the  structure  as  a  whole  must  be 
regarded  as  a  distorting  one. 

This  difference  in  the  objectives  of  filter  theory  and  the  classical  theory 
can  be  conveniently  exemplified  by  their  attitudes  toward  the  concep¬ 
tions  of  characteristic  impedance,  attenuation  and  phase  velocity.  In 
the  classical  theory,  these  quantities  are  frequently  regarded  as  con¬ 
stants,  since  they  approach  constancy  as  the  dispersive  properties  of 

* A possible exception may be found in the "delay networks" used for some
special purposes in the telephone plant.




the  medium  decrease.  In  network  theory,  on  the  other  hand,  they  are 
definitely  functions  of  frequency,  and  their  functional  dependence  on 
frequency  is  kept  constantly  before  us,  since  it  is  just  this  variation 
which  we  wish  to  control. 

Closely  correlated  with  this  difference  in  the  objectives  of  the  two 
theories  is  another  whose  relation  to  the  particular  problems  of  filter 
theory  is  more  immediate.  In  the  classical  theory,  the  medium  which 
we  study  is  uniform  and  continuous.  As  we  have  already  seen,  uniform 
continuous  media  may,  on  occasion,  exhibit  an  appreciable  amount  of 
distortion.  The  distortion  obtained  from  such  media,  however,  is  not 
readily  controllable.  In  the  practical  construction  of  filters  and  similar 
structures,  therefore,  we  turn  to  the  more  flexible  artificial  media 
represented  by  networks  of  discrete  inductances,  capacities  and  resist¬ 
ances.  Now,  since  any  such  network  is  simply  a  vibrating  system,  the 
highly  developed  methods  of  particle  dynamics,  applicable  to  such 


Fig. 1. Early type of low-pass filter


systems,  can  be  employed  to  study  it.  They  lead  to  a  characterization 
of  the  network  in  terms  of  its  natural  frequencies  or  “normal  coor¬ 
dinates.”  For  many  purposes,  this  result  is  just  what  we  are  looking 
for.  It  does  not,  however,  indicate  very  clearly  the  distorting  proper¬ 
ties  of  the  structure  when  it  is  used  as  a  medium  for  wave  transmission. 
In  spite  of  the  discrete  nature  of  the  elements  which  compose  the 
medium  therefore,  it  is  still  convenient,  in  filter  theory,  to  use  terms, 
such  as  characteristic  impedance  and  propagation  constant,  which  were 
originally  developed  for  uniform  media.  To  this  combination  of  a 
medium  of  one  sort  and  a  method  of  analysis  developed  originally  for  a 
medium  of  quite  a  different  sort  may  be  traced  most  of  the  distinctive 
features  of  filter  theory. 

The  contrast  between  the  results  of  the  application  of  the  two  methods 
to  networks  of  discrete  elements  was  clearly  exemplified  even  in  the 
earliest  filters.  The  first  filter  configuration  to  be  discovered  is  that 
shown  by  Fig.  1.*  Long  before  the  filtering  properties  of  the  structure 

* Discovered by G. A. Campbell. A physical theory of this configuration,
applicable also to a number of other configurations, is given by Campbell in the
Bell System Technical Journal for Nov. 1922.




were  known,  an  exactly  equivalent  mechanical  arrangement,  consisting 
of  a  weightless  string  loaded  with  equally  spaced  beads,  had  been  a 
classic  problem  in  the  theory  of  vibrating  systems.  The  methods  of 
ordinary  dynamics  led  to  the  result  that  when  the  ends  of  the  string  are 
fixed, the structure is characterized by natural frequencies proportional
to  the  sines  of  a  number  of  equally  spaced  angles.  This  result  was  a 
useful  one,  since  from  it,  by  taking  the  extreme  case  when  the  number 
of  beads  is  indefinitely  increased,  we  could  readily  conclude  that  the 
natural  frequencies  of  a  uniform  string  are  harmonic. 
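The classical result mentioned here can be illustrated numerically. The sketch below uses the standard textbook formula for $n$ equal beads of mass $m$ at spacing $a$ on a massless string under tension $T$ with fixed ends, $\omega_k = 2\sqrt{T/(ma)}\,\sin\!\big(k\pi/(2(n+1))\big)$; the formula is supplied here as an assumption, since the paper states the result only in words, and the function names are illustrative.

```python
import math


def bead_frequencies(n_beads, tension=1.0, mass=1.0, spacing=1.0):
    """Natural angular frequencies of n equal beads on a massless string
    with fixed ends: omega_k = 2*sqrt(T/(m*a)) * sin(k*pi / (2*(n+1)))."""
    scale = 2.0 * math.sqrt(tension / (mass * spacing))
    return [scale * math.sin(k * math.pi / (2 * (n_beads + 1)))
            for k in range(1, n_beads + 1)]


def harmonic_ratios(n_beads, count=4):
    """Ratios of the lowest `count` frequencies to the fundamental."""
    freqs = bead_frequencies(n_beads)
    return [freqs[k] / freqs[0] for k in range(count)]


few = harmonic_ratios(3, count=3)      # three beads: ratios are not integers
many = harmonic_ratios(2000, count=4)  # many beads: ratios approach 1, 2, 3, 4
```

With only a few beads the overtones are noticeably flat of the integer multiples, while for a large number of beads the low-order ratios approach 1, 2, 3, ..., which is the harmonic spectrum of the uniform string.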

It  did  not,  however,  exhibit  very  clearly  the  filtering  properties  of  the 
system,  which  therefore  went  unnoticed.*  Those  properties  were  only 


Fig. 2. Structure of Fig. 1 sectionalized at mid-series points


Fig. 3. Structure of Fig. 1 sectionalized at mid-shunt points


noticed  when  the  ideas  and  methods  of  particle  dynamics  were  discarded 
in  favor  of  those  developed  for  continuous  media.  The  use  of  these  last 
methods  is  facilitated  if  we  assume  that  the  structure  of  Fig.  1  extends 
indefinitely  in  both  directions  and  has  been  broken  up  into  “sections" 
by  inserting  division  points  periodically  at  the  centers  of  the  series  or 
shunt  branches,  as  shown  by  Figs.  2  and  3.  It  is  then  found  that, 
provided  measurements  are  made  only  at  the  end  points  of  the  sections, 
the network can be replaced by an appropriate one-dimensional

* These properties had actually been discovered considerably before Campbell's
work on electrical structures. See, for example, Tait, Ency. Brit. XXIV. Art.
“Wave,” pp. 417-418, 1889; or Routh, “Advanced Rigid Dynamics,” pp. 254-260,
1892. A later article by C. A. Godfrey (Phil. Mag., Apr. 1898) may also be mentioned.
The analysis, however, was more nearly related to the dynamics of
continuous media and to the methods later employed by Campbell for electrical
networks than it was to the dynamics of particles.

continuous medium, the “equivalent line,” along which the division points
corresponding  to  the  sections  of  the  network  are  equally  spaced.  A 
study  of  the  characteristic  impedance  and  propagation  constant  of  this 
equivalent  line  shows  that  steady-state  sinusoidal  waves  of  low  fre¬ 
quencies  travel  freely  down  the  line  while  waves  of  higher  frequencies 
suffer  a  steady  reduction  in  amplitude  as  they  proceed.  The  reduction 
in  amplitude  secured  from  a  single  section  is  shown  by  Fig.  4.  The 
structure  of  Fig.  1  is,  therefore,  a  low-pass  filter  and  can  be  used  to 
separate  waves  of  low  frequency  from  waves  of  higher  frequency.  By 
traveling  sufficiently  far  down  the  line,  the  separation  can  be  made  as 
complete  as  we  desire. 
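The behaviour described in this paragraph can be sketched with the standard ladder-network relation $\cosh\Gamma = 1 + Z_1/(2Z_2)$ for one full section, with series impedance $Z_1 = j\omega L$ and shunt impedance $Z_2 = 1/(j\omega C)$. That relation and the normalised element values $L = C = 1$ are assumptions brought in for illustration; the paper itself presents the result only through Fig. 4. The function names are illustrative.

```python
import cmath
import math


def propagation_constant(omega, L=1.0, C=1.0):
    """Per-section propagation constant of a series-L, shunt-C ladder,
    from cosh(Gamma) = 1 + Z1 / (2 * Z2)."""
    z1 = 1j * omega * L           # series inductance
    z2 = 1.0 / (1j * omega * C)   # shunt capacitance
    return cmath.acosh(1 + z1 / (2 * z2))


def attenuation_nepers(omega, L=1.0, C=1.0):
    """Attenuation per section: real part of the propagation constant."""
    return abs(propagation_constant(omega, L, C).real)


def current_ratio(omega, sections=1, L=1.0, C=1.0):
    """Ratio of current amplitudes across `sections` sections of the line."""
    return math.exp(-sections * attenuation_nepers(omega, L, C))


def cutoff(L=1.0, C=1.0):
    """Angular frequency above which attenuation sets in."""
    return 2.0 / math.sqrt(L * C)


below = attenuation_nepers(1.0)   # inside the pass band (cutoff is 2.0 here)
above = attenuation_nepers(4.0)   # well above cutoff
ten_sections = current_ratio(4.0, sections=10)
```

Below the cutoff the attenuation vanishes and only the phase changes; above it the per-section attenuation grows with frequency, and cascading sections multiplies the amplitude reduction, which is the sense in which travelling further down the line makes the separation as complete as desired.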




Fig. 4. Ratio of current amplitudes at low-pass filter section terminals

In  practice,  of  course,  it  is  not  possible  to  work  with  a  structure  of 
more  than  finite  length  so  that  the  assumption  that  the  network  con¬ 
tinues  indefinitely  is  not  a  valid  one.  We  find,  however,  that,  if  the 
terminating  impedances  are  well  chosen,  the  operation  of  even  a  rather 
short line of this type may be made to approximate fairly closely the
idealized conditions which have been described.

In  the  network  of  Fig.  1  the  parentage  of  filter  theory  in  the  theory  of 
continuous  media  is  apparent.  The  structure  differs  from  an  ideal 
transmission  line  only  in  the  fact  that  its  series  inductances  and  shunt 
capacities  are  lumped,  instead  of  being  uniformly  distributed.  As  the 
practical  utility  of  filters  in  communication  engineering  became  more 
apparent  attempts  were  made  to  find  other  circuits  having  more  or  less 
similar  filtering  properties.  The  earliest  configurations  to  be  discovered, 
however,  were  still  of  this  recurrent  ladder  or  series-shunt  type  and 
could  be  derived  from  the  first  by  easy  mathematical  analogies.  For 



example,  since  the  product  of  the  series  and  shunt  impedances  in  the 
structure  of  Fig.  1  is  independent  of  frequency,  its  characteristic  im¬ 
pedance  and  propagation  constant  depend  only  upon  the  ratio  of  the 
two  impedances.  By  choosing  other  series  and  shunt  impedances 
whose  ratios  vary  in  other  ways  with  frequency,  other  varieties  of  filters 
belonging to this so-called constant-k type can be found. Thus in the
structure  of  Fig.  5  the  ratio  of  the  series  and  shunt  impedances  as  a 
function  of  frequency  is  just  the  reciprocal  of  the  ratio  obtained  from 
the  structure  of  Fig.  1.  This  new  network  is,  therefore,  a  high-pass 
filter.  A  band-pass  filter  obtained  from  similar  considerations  is  shown 


Fig.  6.  Early  type  of  band-pass  filter 


by  Fig.  6.  In  all  of  these  structures  the  attenuation  (defined  as  the 
negative  logarithm  of  the  current  ratio  exemplified  by  Fig.  4)  rises 
monotonically  as  we  move  away  from  the  band  of  transmitted  frequen¬ 
cies.  The  characteristic  of  the  structure  of  Fig.  1,  for  example,  is  given 
symbolically  by  Fig.  7-A. 
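
As a rough numerical check on this monotonic rise, the sketch below evaluates the attenuation of one full section of the low-pass ladder of Fig. 1, assuming the standard ladder relation cosh θ = 1 + Z_series/(2 Z_shunt). The element values, the function names and the choice of Python are illustrative assumptions and are not taken from the paper.

```python
import cmath
import math


def lowpass_attenuation(freq_hz, L=1e-3, C=1e-6):
    """Attenuation in nepers of one constant-k low-pass section.

    Assumed (hypothetical) values: series inductance L in henries and
    shunt capacitance C in farads per section. freq_hz must be > 0.
    """
    w = 2 * math.pi * freq_hz
    z_series = 1j * w * L
    z_shunt = 1 / (1j * w * C)
    # Standard ladder relation for a full section.
    theta = cmath.acosh(1 + z_series / (2 * z_shunt))
    return abs(theta.real)


def cutoff_hz(L=1e-3, C=1e-6):
    """Cut-off frequency, reached when (2*pi*f)**2 * L * C equals 4."""
    return 1 / (math.pi * math.sqrt(L * C))


if __name__ == "__main__":
    fc = cutoff_hz()
    for ratio in (0.25, 0.5, 0.9, 1.5, 2.0, 4.0):
        print(ratio, round(lowpass_attenuation(ratio * fc), 4))
```

With the assumed values the cut-off falls at roughly 10,000 cycles per second. The computed attenuation is zero below the cut-off and increases steadily above it.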

A  much  greater  advance  was  made  by  O.  J.  Zobel’s  recognition,  in 
1919,  of  the  existence  of  large  classes  of  filters  having  identical  char¬ 
acteristic  impedances  and  his  development  of  certain  formulae  by 
means  of  which  all  the  various  members  of  the  class  corresponding  to  a 
single  “prototype”  could  be  obtained  by  merely  varying  a  parameter 
m.* Since all the members of any such “m-derived” class have identical
characteristic  impedances,  it  is  obviously  possible  to  unite  two  such 

*  “Theory  and  Design  of  Uniform  and  Composite  Electric  Wave  Filters,”  Bell 
System  Technical  Journal,  Jan.  1923. 




structures  in  tandem  without  introducing  reflection  effects.  Their 
propagation  constants,  on  the  other  hand,  may  vary  widely  from  one 
another; as, for example, is the case with the structures of Figs. 8 and 9,
which  are  m-derived  from  Fig.  1  but  have  attenuation  characteristics 
of  the  form  shown  by  Fig.  7-B.*  By  placing  such  structures  in  tandem 


Fig. 7. Symbolic attenuation characteristics of low-pass filters


Fig. 8. Mid-series m-derived type low-pass filter


with  one  another  or  with  the  original  prototype  structure,  therefore, 
it  was  possible  to  obtain  “composite  filters”  having  a  much  wider  variety 

*  The  frequency  of  infinite  attenuation  exhibited  by  the  characteristic  in 
Fig.  7-B  depends  on  the  value  of  m.  By  choosing  m  suitably,  it  can  be  placed  any¬ 
where  between  the  cut-off  and  infinite  frequency.  When  it  is  put  at  infinite 
frequency,  we  obtain  the  “prototype”  structure  of  Fig.  1. 




of  attenuation  characteristics  than  could  be  obtained  with  the  prototype 
structures  alone. 
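
The dependence of the frequency of infinite attenuation upon m can be made concrete. For m-derived low-pass sections the usual relation is f_inf = f_c / sqrt(1 - m^2), so that m = 1 places the peak at infinite frequency and returns the prototype. The sketch below simply evaluates this relation in both directions; the function names and the sample figures are hypothetical.

```python
import math


def infinite_attenuation_hz(cutoff_hz, m):
    """Frequency of infinite attenuation of an m-derived low-pass section."""
    if not 0 < m <= 1:
        raise ValueError("m must lie in the interval (0, 1]")
    if m == 1:
        return math.inf  # prototype: the peak recedes to infinite frequency
    return cutoff_hz / math.sqrt(1 - m * m)


def m_for_peak(cutoff_hz, peak_hz):
    """Value of m that places the attenuation peak at peak_hz."""
    if peak_hz <= cutoff_hz:
        raise ValueError("the peak must lie above the cut-off")
    return math.sqrt(1 - (cutoff_hz / peak_hz) ** 2)


if __name__ == "__main__":
    print(infinite_attenuation_hz(1000.0, 0.6))
    print(m_for_peak(1000.0, 1250.0))
```

Smaller values of m bring the peak closer to the cut-off, which is what makes the derived sections useful as companions to the prototype in a composite filter.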

The filters to which this design method led were not only an important
step forward practically, but they also represented an important
departure  from  theoretical  tradition,  for  they  were  no  longer  recurrent 
structures  that  could  be  thought  of  as  parts  of  simple  infinite  chains. 
This  tendency  toward  irregularity  of  structure  has  been  even  more 
noteworthy  in  the  still  more  recent  discoveries  of  networks  having 
filtering  properties.  Some  of  these  are  still  of  the  ladder  type.^  It  has 
also  been  found,  however,  that  a  number  of  other  configurations  are 
capable  of  giving  filter  characteristics.  Chief  among  these  is  the 
lattice or Wheatstone bridge.* In its simplest forms, the lattice furnishes
characteristics similar to those obtained from the previous
ladder  structures.  As  the  number  of  elements  in  the  lattice  is  increased, 
however,  many  much  more  complicated  impedance  and  transmission 
characteristics  can  be  secured.  Of  a  wide  variety  of  other  recently 
developed filter structures, the bridged-T section described by Baerwald,†
and the work done by Jaumann,‡ Baerwald† and others on

^ See, for example, O. J. Zobel’s “Multiple M-Derived Structures” in the Bell
System  Technical  Journal,  Apr.  1931. 

*  The  introduction  of  lattice  type  filter  sections  can  be  traced  to  Campbell  (see 
article  referred  to  above).  A  number  of  particular  structures  were  also  dis¬ 
covered  by  K.  S.  Johnson  (U.  S.  Pat.  1501667)  and  A.  C.  Bartlett  (Brit. 
Pat.  253629).  The  general  lattice  analysis,  referred  to  here,  appears  to  have  been 
developed  independently  and  in  approximately  the  same  terms  by  W.  Cauer  and 
the  author.  Cauer’s  work  is  described  in  “Siebschaltungen,”  VDI — Verlag., 
Berlin,  1931,  and  is  reviewed  in  later  papers  by  Cauer  in  “Physics,”  and  by  Guille- 
min  in  this  journal.  The  author's  work  is  given  in  U.  S.  Pat.  1828454,  filed 
July  3,  1930.  In  both  cases,  presumably,  the  actual  work  was  done  considerably 
before the dates given. In this connection a brief discussion by H. G. Baerwald
in  Sitzs.-Ber.  Preuss.  Ak.  d.  Wiss.,  1931,  may  also  be  mentioned. 

The  lattice  analysis  is  of  particular  interest  for  the  purposes  of  the  present 
paper,  both  because  this  particular  configuration  is  used  largely  in  the  later  stages 
of the paper to show that a physical embodiment of certain of the predicted characteristics
can be secured, and because the use of the resonances and anti-resonances
of the branches of the lattice to specify its characteristics is similar to,
although  somewhat  more  particularized  than,  the  use  made  of  the  natural  fre¬ 
quencies  of  the  general  network  in  the  discussion  which  follows.  Because  of  this 
analogy  and  the  fact  that  so  much  has  already  been  written  about  the  lattice,  the 
discussion  of  the  possible  expressions  for  the  impedance  and  transmission  char¬ 
acteristics  in  terms  of  natural  frequencies,  which  would  otherwise  be  of  great 
interest,  is  given  only  in  condensed  form. 

† Loc. cit.

‡ E. N. T., 1932, Heft 7.




filtering circuits obtained by associating transformers in various ways
with standard elements, may also be mentioned.

These  developments  have  considerably  increased  the  scope  of  the 
filter  art.  Their  very  diversity,  however,  raises  an  interesting  and 
difficult  question.  Broadly  speaking,  it  appears  from  the  survey  we 
have just made that the filter circuits are more or less isolated phenomena.*
Each, in other words, represents the independent discovery of a
particular physical configuration whose filtering properties could be
established by direct inspection. The progress of the filter art has
appeared to be, in effect, a series of such happy accidents, each adding
its own individual contribution to the accumulation of devices at the
disposal of the filter engineer. We are left quite in the dark as to the
number of such fortunate occurrences, with consequent increases in the
variety of characteristics available, which may be expected in the
future as the infinity of possible network configurations is examined
one  by  one. 

So  long  as  the  known  filter  circuits  were  derived  from  recurrent 
ladder structures, this characteristic of filter theory raised no embarrassing
questions. It was always possible to assume that the ladder configuration
was particularly assigned by a beneficent providence to
provide  communication  engineers  with  the  filters  they  needed.  By 
examining  the  various  combinations  of  series  and  shunt  branches,  then, 
it  could  be  expected  that  the  useful  types  of  filters  would  be  discovered 
with  reasonable  rapidity.  The  astonishing  variety  of  configurations 
which  have  been  discovered  to  have  filtering  properties  in  recent  years, 
however,  makes  it  evident  that  filtering  action  by  no  means  inheres  in 
any  particular  physical  arrangement.  We  are  left  face  to  face  with  two 
problems.  The  first  is  that  of  trying  to  establish  some  general  property 
which distinguishes filters of all configurations from general networks.
The  second  is  that  of  attempting  to  describe  the  most  general  char¬ 
acteristics  which  can  be  obtained  from  filters  when  the  restriction  to 
particular  physical  configurations  is  abandoned.  Since  mere  existence 
theorems  are  of  little  value  in  building  a  communication  plant,  these 
two  problems  may  well  be  complemented  by  a  third,  that  of  finding  some 

* Lattice type filters are entitled to somewhat more favorable mention than
this paragraph might suggest because of the fact that it can be shown that a lattice
structure can be found which will be equivalent to any symmetrical network.
This still leaves us without information on the general case of unsymmetrical
networks, however, and in addition all of our general criticisms of the demonstration
of filter properties by direct mesh computations hold for the lattice as well
as  for  other  structures. 




feasible  method  of  constructing  networks  furnishing  any  of  the  char¬ 
acteristics  which  the  answers  to  the  first  two  questions  show  to  be 
physically  possible. 

It  is  this  three-fold  problem  to  which  this  paper  is  devoted.  In  order 
to  make  the  answers  as  satisfactory  as  possible,  the  theory  is  developed 
in a form which is independent of the physical configuration of the network.
In other words, the structure to be studied is visualised merely
as  a  network  with  accessible  input  and  output  terminals,  the  interior 
structure  being  any  arbitrary  arrangement  of  positive  inductances  and 
capacities together with physically realisable systems of mutual inductance.*
Our first problem is that of determining the conditions which
must be satisfied by such a general network if it is to be a filter at all.**
From  this  foundation,  expressions  for  the  most  general  external  char¬ 
acteristics  of  the  structure,  and  a  routine  for  obtaining  a  definite  physical 
embodiment  corresponding  to  any  such  characteristics,  are  developed 
in  due  course. 

The  paper  falls  naturally  into  two  halves.  The  first  half  is  concerned 
with  the  first  two  of  the  problems  mentioned  above.  The  solution 
found  for  these  problems  amounts,  broadly  speaking,  to  a  reconciliation 
of the wave method of analyzing networks with the normal coordinate
method  of  ordinary  vibration  mechanics.  As  we  have  already  seen,  a 
single  normal  coordinate  solution  of  the  network  throws  no  light  upon 
the  properties  of  the  structure  considered  as  a  medium  for  the  transmis¬ 
sion  of  waves.  By  making  a  set  of  such  solutions,  corresponding  to 
various  constraints  imposed  upon  the  system  at  its  accessible  terminals, 
however,  the  wave  properties  of  the  structure  can  be  determined.  The 
method  is  a  convenient  one  both  because  the  normal  coordinate  analysis 
is  quite  general  and  because  it  is  so  well  understood  that  the  develop¬ 
ment  of  filter  theory  per  se  can  be  much  abridged.  It  has  the  additional 
advantage  that  the  normal  coordinates  thus  determined  make  very 
convenient  parameters  in  the  general  formulae  for  the  external  char¬ 
acteristics  of  the  structure. 

The  second  half  of  the  paper,  which  deals  with  the  problem  of  finding 
definite physical embodiments for filters, is, in many respects, a

* The restriction to reactive networks implied here will be shown later to be a
consequence of the fact that if the network is to exhibit the continuous band of
free transmission characteristic of filters it cannot contain elements which dissipate
energy.

** For the purposes of this paper a “filter” will be defined as a structure transmitting
a single continuous frequency band. The exact definition is based upon
the usual image parameters and will be discussed in more detail later.

generalisation of Zobel’s composite filter method. The discussion of the first
half  closes  by  showing  that  all  possible  filter  characteristics  can  be 
realised  by  a  combination  of  certain  elementary  networks  in  tandem, 
provided  physical  structures  of  the  required  types  can  be  found.  The 
required  elementary  structures  can  be  divided  broadly  into  symmetrical 
and  unsymmetrical  types  and  are  taken  up  in  succession  in  the  second 
half  of  the  paper.  In  each  case,  it  is  found  that  the  elementary  struc¬ 
tures  can  be  built  in  a  variety  of  forms.  As  far  as  possible,  the  set  which 
is  given  has  been  made  up  of  familiar  structures.  In  order  to  complete 
the  list,  however,  it  has  been  necessary  to  include  also  one  or  two  con¬ 
figurations  not  previously  described. 

While  the  paper  is  primarily  concerned  with  theoretical  questions,  it 
has  also  a  number  of  interesting  and  important  practical  by-products. 
One  of  these,  the  discovery  of  new  types  of  sections  having  novel  char¬ 
acteristics,  has  already  been  mentioned.  A  second  has  to  do  with  an 
increase  in  our  grasp  of  the  relations  between  familiar  types  of  struc¬ 
tures,  such  as  those  between  the  ladder  and  the  lattice  or  between  two 
lattice  structures  of  different  degrees  of  complexity.  A  number  of 
relations between familiar types of sections are already known, but for
many  purposes  our  knowledge  is  too  incomplete  to  be  of  great  service. 
The  elementary  structures  which  make  up  the  general  composite  filter, 
however,  furnish  a  sort  of  “common  denominator”  by  means  of  which 
known  sections  of  various  kinds  can  be  effectively  compared  and  related 
to  one  another.  The  result  is  of  practical  importance  both  because  it 
allows  us  to  apply  the  experience  and  knowledge  gained  with  filters  of 
one  type  to  filters  of  other  types  and  because  the  various  familiar  con¬ 
figurations  are  by  no  means  alike  in  the  ease  and  economy  with  which 
they  can  be  constructed  physically.  With  the  help  of  the  present  theory, 
we  can  readily  convert  filter  designs  from  one  physical  configuration  into 
others  which  may  be  more  suitable  for  purposes  of  practical  construction. 
These  possibilities  are  discussed  in  more  detail  in  the  concluding  section 
of  the  paper. 

One  further  question  should  be  mentioned.  It  was  suggested  in  an 
introductory  paragraph  that  the  distortionless  properties  of  ideal  con¬ 
tinuous  media  were  a  result  of  certain  uniformities  in  their  characteristic 
impedances  and  propagation  constants  as  functions  of  frequency.  The 
inherent  selective  properties  of  a  filter  obviously  prevent  it  from  meeting 
these  criteria  in  all  parts  of  the  frequency  spectrum.  It  is  natural,  how¬ 
ever,  to  suggest  that  an  ideal  filter  should  be  one  whose  characteristics 
resemble  those  of  the  ideal  continuous  medium  as  closely  as  is  consistent 




with  filtering  action.  In  other  words,  the  ideal  filter  should  be  a  struc¬ 
ture  which  has  the  same  characteristics  as  the  continuous  medium  in 
the  range  of  transmitted  frequencies  and  completely  eliminates  all  other 
frequencies. The early ladder type filters failed to meet these requirements
by a wide margin.* Some of the ideal characteristics can be
approximated  more  closely  by  the  use  of  recently  developed  filter 
structures  or  by  the  use  of  auxiliary  correcting  networks,  but  the  solu¬ 
tion  remains  incomplete.  A  solution  giving  as  close  an  approximation 
as  we  please  to  the  ideal  requirements  by  the  use  of  a  sufficient  number 
of  elements  has,  however,  been  developed  by  means  of  the  analysis  in 
the  present  paper.  It  will  be  described  in  a  forthcoming  paper  by 
R. L. Dietzold in collaboration with the author.**

PART  I.  PRELIMINARY  ANALYSIS 

The  general  filter  circuit  with  which  we  will  deal  is  represented 
schematically  by  the  box  with  numbered  terminals  in  Fig.  10.  The 


Fig. 10. General filter circuit


configuration  of  the  network  within  the  box  is  supposed  to  be  unknown. 
We  will  assume,  however,  that  it  is  made  up  of  some  combination  of 
ordinary  positive  inductances  and  capacities.  Mutual  inductance 
linkages  between  the  coils  will  be  permitted  but  it  will  be  assumed  that 
they  are  not  larger  than  could  be  obtained  physically  with  the  given 
self-inductances.*** The external circuits to which the filter is connected
have been represented by the generators $E_1$ and $E_2$ in series, respectively,
with the impedances $Z_1$ and $Z_2$ in Fig. 10. We will assume for purposes
of  analysis  that  these  four  quantities  can  be  varied  at  will,  although  the 

* See, for example, the discussion in “Impedance Correction of Wave Filters”
by  E.  B.  Payne  in  the  Bell  System  Technical  Journal,  Oct.  1930,  and  in  papers  by 
Lane,  Steinberg,  and  Nyquist  and  Brand  in  the  Trans,  of  the  A.  I.  E.  E.  for 
May,  1930. 

** Bell System Technical Journal, Jan. 1935.

*** That is, the energy of the magnetic field must be positive for any choice of
the  instantaneous  currents  in  the  coils. 




other  portions  of  the  complete  network  will  naturally  be  inaccessible. 
In practice, of course, the external circuits to which the filter is connected
may not have this simple form. Since Thévenin’s theorem*
allows  us  to  replace  any  linear  external  circuit  at  a  single  frequency  by  a 
generator  in  series  with  an  impedance,  however,  there  is  no  essential 
loss  in  generality  in  representing  the  external  circuits  in  the  manner 
shown  by  the  figure. 
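
The replacement permitted by Thévenin's theorem can be illustrated on the simplest case. The sketch below assumes a hypothetical external circuit consisting of a generator with a series impedance and a shunt impedance across the filter terminals, and checks at a single frequency that the equivalent generator and series impedance deliver the same current to an arbitrary load. All names and numerical values are illustrative.

```python
def thevenin_of_divider(e_source, z_series, z_shunt):
    """Equivalent generator voltage and series impedance of a loaded divider."""
    e_th = e_source * z_shunt / (z_series + z_shunt)
    z_th = z_series * z_shunt / (z_series + z_shunt)
    return e_th, z_th


def load_current_direct(e_source, z_series, z_shunt, z_load):
    """Load current found by direct analysis of the original circuit."""
    z_parallel = z_shunt * z_load / (z_shunt + z_load)
    v_load = e_source * z_parallel / (z_series + z_parallel)
    return v_load / z_load


def load_current_thevenin(e_th, z_th, z_load):
    """Load current found from the equivalent generator and impedance."""
    return e_th / (z_th + z_load)


if __name__ == "__main__":
    e_th, z_th = thevenin_of_divider(10.0, 50 + 20j, -100j)
    print(load_current_direct(10.0, 50 + 20j, -100j, 75.0))
    print(load_current_thevenin(e_th, z_th, 75.0))
```

The two printed currents agree to rounding error, whatever load is chosen, which is the sense in which no generality is lost.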


Mesh Equation Formulation

Whatever  the  internal  configuration  of  the  network  may  be,  it  can,  of 
course,  be  represented  formally  by  a  set  of  mesh  equations  of  the  usual 
sort.  If  we  assume  that  k  represents  the  total  number  of  meshes  in  the 
circuit,  and  that  the  meshes  have  been  so  chosen  that  the  first  mesh  is 
the  only  one  which  passes  through  the  input  terminals  and  the  second 
mesh  the  only  one  which  passes  through  the  output  terminals,  the  result 
can  be  written  as 

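$$
\begin{aligned}
(pZ_1 + a_{11})\,I_1 + a_{12}I_2 + \cdots + a_{1k}I_k &= pE_1 \\
a_{21}I_1 + (pZ_2 + a_{22})\,I_2 + \cdots + a_{2k}I_k &= pE_2 \\
a_{31}I_1 + a_{32}I_2 + \cdots + a_{3k}I_k &= 0 \\
\vdots \qquad\qquad &\phantom{=}\ \vdots \\
a_{k1}I_1 + a_{k2}I_2 + \cdots + a_{kk}I_k &= 0
\end{aligned}
\tag{1}
$$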

where $p = i2\pi f$ and $a_{ij} = a_{ji} = p^2 L_{ij} + 1/C_{ij}$. The variable $p$ is introduced,
in accordance with the usual convention, in the process of representing
the actual sinusoidal voltages and currents in the network by
complex exponentials of the form $E_j e^{pt}$ and $I_j e^{pt}$, where the $E$'s and $I$'s
are some real or complex constants. In practical applications, the
frequency will, of course, be a real quantity and $p$ will therefore be a
pure imaginary. For purposes of analysis, however, we will assume
that $p$ is a complex variable. We will also find it convenient, in most
circumstances, to assume that the $p^2$ which appears in the general impedance
coefficient $a_{ij}$ has been replaced by a new variable, $\lambda$.
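
To show how a set of mesh equations of this form is used, the sketch below builds the coefficients p^2 L_ij + 1/C_ij for a small network, adds the terminating impedances to the first two meshes and solves for the mesh currents. The example network (a two-mesh low-pass T section between equal resistances), the element values and all function names are assumptions made for illustration. The matrix S holds the reciprocal-capacitance terms, with a negative mutual term because both mesh currents are taken in the same sense through the shunt condenser.

```python
import math


def mesh_matrix(p, L, S):
    """Coefficients a[i][j] = p**2 * L[i][j] + S[i][j], with S[i][j] = 1/C_ij."""
    k = len(L)
    return [[p * p * L[i][j] + S[i][j] for j in range(k)] for i in range(k)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def mesh_currents(freq_hz, L, S, E1, E2, Z1, Z2):
    """Mesh currents of a network driven through Z1 and Z2 by E1 and E2."""
    p = 2j * math.pi * freq_hz
    a = mesh_matrix(p, L, S)
    a[0][0] += p * Z1  # input termination enters the first mesh only
    a[1][1] += p * Z2  # output termination enters the second mesh only
    rhs = [p * E1, p * E2] + [0j] * (len(L) - 2)
    return solve(a, rhs)


if __name__ == "__main__":
    L = [[0.5e-3, 0.0], [0.0, 0.5e-3]]    # half series coil in each mesh
    S = [[1e6, -1e6], [-1e6, 1e6]]        # shunt condenser of 1 microfarad
    R = math.sqrt(1e-3 / 1e-6)
    print(mesh_currents(10.0, L, S, 1.0, 0.0, R, R))
```

At a frequency far below the cut-off the output current approaches E1/(2R), as it must when the section is effectively a direct connection between the two resistances.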

In equations (1) $I_3 \ldots I_k$, which flow in the concealed meshes, are
not of interest and can be eliminated in terms of $I_1$ and $I_2$ by using the
last $k-2$ of the set of equations. The elimination is facilitated by the
fact  that  since  the  coefficients  of  the  original  equations  are  symmetrical 

* Comptes Rendus, Vol. 97, p. 159, or K. S. Johnson, “Transmission Circuits
for  Telephonic  Communication,”  p.  87. 




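we can make use of the well-known theorem that in any symmetrical
determinant, $\Delta$,

$$\Delta_{11}\Delta_{22} - \Delta_{12}^2 = \Delta\,\Delta_{1122} \tag{2}$$

where $\Delta_{11}$ is the minor of the term in the first row and first column and
the other quantities have a similar significance. We shall take $\Delta$ as the
determinant of the coefficients of (1) when $Z_1 = Z_2 = 0$. It is then
easy to show that the set of equations can be reduced to

$$
\begin{aligned}
\left(pZ_1 + \frac{\Delta_{22}}{\Delta_{1122}}\right) I_1 - \frac{\Delta_{12}}{\Delta_{1122}}\, I_2 &= pE_1 \\
-\frac{\Delta_{12}}{\Delta_{1122}}\, I_1 + \left(pZ_2 + \frac{\Delta_{11}}{\Delta_{1122}}\right) I_2 &= pE_2
\end{aligned}
\tag{3}
$$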
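
The theorem referred to as (2), taken here in its usual form Δ11·Δ22 − Δ12² = Δ·Δ1122 for a symmetrical determinant, can be checked numerically. The sketch below does so in exact integer arithmetic for an arbitrarily chosen symmetrical array of the fourth order; the array and the helper names are hypothetical.

```python
def det(m):
    """Determinant by expansion along the first row (small arrays only)."""
    if not m:
        return 1  # the empty determinant, needed when k = 2
    total = 0
    for j, value in enumerate(m[0]):
        sub = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * value * det(sub)
    return total


def minor(m, rows, cols):
    """Determinant of m with the given rows and columns (0-based) struck out."""
    return det([[v for j, v in enumerate(row) if j not in cols]
                for i, row in enumerate(m) if i not in rows])


def jacobi_sides(m):
    """Return both sides of the identity for the first two rows and columns."""
    d = det(m)
    d11 = minor(m, {0}, {0})
    d22 = minor(m, {1}, {1})
    d12 = minor(m, {0}, {1})  # enters squared, so its sign is immaterial
    d1122 = minor(m, {0, 1}, {0, 1})
    return d11 * d22 - d12 ** 2, d * d1122


if __name__ == "__main__":
    a = [[4, 1, 2, 0],
         [1, 5, 1, 3],
         [2, 1, 6, 1],
         [0, 3, 1, 7]]
    print(jacobi_sides(a))
```

Both sides come out equal for any symmetrical array, which is what allows the concealed meshes to be eliminated in so compact a form.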


The  Image  Parameters  of  the  Network 


This last pair of equations expresses the performance of the structure
by the three quantities $\Delta_{11}/\Delta_{1122}$, $\Delta_{22}/\Delta_{1122}$ and $\Delta_{12}/\Delta_{1122}$. Instead of using these
three quantities, we can, of course, specify the structure by any other set
of three derived from them. For our purposes we shall adopt the
customary image parameters, $Z_{I_1}$, $Z_{I_2}$ and $\theta$. In the usual fashion
$Z_{I_1}$ and $Z_{I_2}$ may be defined as the impedances with which the structure
must be terminated so that the impedances seen in both directions from
the network at each pair of terminals are the same, as shown by Fig. 11,
while $\theta$ is one-half the natural logarithm of the ratio of the volt-amperes flowing
into  and  out  of  the  network  under  these  conditions.  From  this  defini¬ 
tion  follows  the  well-known  relation  that  if  two  structures  are  connected 
in  tandem  with  equal  image  impedances  at  their  common  junctions,  as 
shown  by  Fig.  12,  the  image  impedances  of  the  resulting  four-terminal 
network  will  be  the  same  as  the  image  impedances  at  the  free  ends  of  the 
constituent  networks,  while  the  transfer  constant  of  the  resulting  net¬ 
work  will  be  the  sum  of  the  transfer  constants  of  the  constituents. 
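
The tandem relation can be verified numerically with the general circuit constants (A, B, C, D) of a four-terminal network, using the standard expressions of these constants in terms of the image parameters. The sketch below is confined, as a simplifying assumption, to real positive image impedances and real transfer constants, so that no question of branch of the square root arises. The function names and numerical values are hypothetical.

```python
import math


def abcd_from_image(z1, z2, theta):
    """General circuit constants of a network with image parameters z1, z2, theta."""
    ratio = math.sqrt(z1 / z2)
    mean = math.sqrt(z1 * z2)
    return (ratio * math.cosh(theta), mean * math.sinh(theta),
            math.sinh(theta) / mean, math.cosh(theta) / ratio)


def cascade(first, second):
    """Constants of two networks in tandem (product of the two matrices)."""
    a1, b1, c1, d1 = first
    a2, b2, c2, d2 = second
    return (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2)


def image_from_abcd(network):
    """Recover the image impedances and transfer constant from A, B, C, D."""
    a, b, c, d = network
    z1 = math.sqrt(a * b / (c * d))
    z2 = math.sqrt(b * d / (a * c))
    theta = math.atanh(math.sqrt(b * c / (a * d)))
    return z1, z2, theta


if __name__ == "__main__":
    # The two networks present the same image impedance (200) at the junction.
    first = abcd_from_image(50.0, 200.0, 0.7)
    second = abcd_from_image(200.0, 75.0, 1.1)
    print(image_from_abcd(cascade(first, second)))
```

The combination shows image impedances of 50 and 75, those at the free ends, and a transfer constant of 1.8, the sum of the constituents. If the impedances at the junction are made unequal the simple addition no longer holds.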

Expressions for $Z_{I_1}$, $Z_{I_2}$ and $\theta$ in terms of the network determinant and
its minors are easily found from their definitions and equations (3). For
example, if we set $E_2 = 0$, $Z_1 = Z_{I_1}$, $Z_2 = Z_{I_2}$, the impedance at the input
terminals must, by definition, be $Z_{I_1}$, while the ratio $E_1/I_1$, representing
the impedance of the network plus that of the generator, must be
$2Z_{I_1}$. Substitution in equations (3) therefore gives:


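$$
\frac{pE_1}{I_1} = 2pZ_{I_1} = pZ_{I_1} + \frac{\Delta_{22}}{\Delta_{1122}} - \frac{\Delta_{12}^2}{\Delta_{1122}\left(pZ_{I_2}\Delta_{1122} + \Delta_{11}\right)}
\tag{4}
$$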




A  similar  equation  is  obtained  by  assuming  that  the  generator  is  in  the 
output  circuit  and  considering  the  impedance  looking  into  the  output 
terminals.  When  the  two  equations  are  solved  simultaneously  with  the 
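help of (2), we obtain

$$
Z_{I_1} = \frac{1}{p}\sqrt{\frac{\Delta\,\Delta_{22}}{\Delta_{11}\Delta_{1122}}}, \qquad
Z_{I_2} = \frac{1}{p}\sqrt{\frac{\Delta\,\Delta_{11}}{\Delta_{22}\Delta_{1122}}}
\tag{5}
$$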


The transfer constant can now be found by substituting $Z_{I_1}$ and $Z_{I_2}$
for $Z_1$ and $Z_2$ in (3), setting $E_2 = 0$, and solving for the ratio between the


Fig. 11. Network terminated in image impedances


Fig. 12. Composite network with matched image impedances


volt-amperes  at  the  output  and  input  terminals.  This  gives 


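$$
e^{2\theta} = \frac{\sqrt{\Delta_{11}\Delta_{22}} + \sqrt{\Delta\,\Delta_{1122}}}{\sqrt{\Delta_{11}\Delta_{22}} - \sqrt{\Delta\,\Delta_{1122}}}
\tag{6}
$$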


which  can  be  somewhat  more  conveniently  written  as 


$$
\tanh\theta = \sqrt{\frac{\Delta\,\Delta_{1122}}{\Delta_{11}\Delta_{22}}}
\tag{7}
$$


Since the input impedance of the network, as determined from equations
(1), is $\Delta/p\Delta_{11}$ when the output terminals are short-circuited and




is $\Delta_{22}/p\Delta_{1122}$ when they are open-circuited, these equations can also be
written in the general form:*

$$
Z_I = \sqrt{Z_{sc}\,Z_{oc}}; \qquad \tanh\theta = \sqrt{\frac{Z_{sc}}{Z_{oc}}}
\tag{8}
$$

Most of our formal analysis will be based upon the expressions given in
(5)  and  (7).  The  formulae  given  by  (8)  will,  however,  frequently  be 
convenient  for  descriptive  purposes. 
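
For descriptive purposes the short-circuit and open-circuit form is easily put to work. The sketch below applies it to an assumed symmetrical low-pass T section (half the series inductance in each arm, the shunt condenser between them) and returns the image impedance and transfer constant at a given frequency. Element values and function names are illustrative, and the inverse hyperbolic tangent gives the phase only as a principal value.

```python
import cmath
import math


def t_section_sc_oc(freq_hz, L=1e-3, C=1e-6):
    """Short- and open-circuit input impedances of a symmetrical low-pass T."""
    w = 2 * math.pi * freq_hz
    z_arm = 1j * w * L / 2          # each series arm carries half of L
    z_shunt = 1 / (1j * w * C)
    z_oc = z_arm + z_shunt
    z_sc = z_arm + z_arm * z_shunt / (z_arm + z_shunt)
    return z_sc, z_oc


def image_parameters(z_sc, z_oc):
    """Image impedance and transfer constant from the two measured impedances."""
    z_image = cmath.sqrt(z_sc * z_oc)
    theta = cmath.atanh(cmath.sqrt(z_sc / z_oc))  # principal value only
    return z_image, theta


if __name__ == "__main__":
    fc = 1 / (math.pi * math.sqrt(1e-3 * 1e-6))   # cut-off of the section
    for ratio in (0.5, 2.0):
        print(ratio, image_parameters(*t_section_sc_oc(ratio * fc)))
```

At half the cut-off frequency the image impedance is a pure resistance and the transfer constant a pure imaginary; at twice the cut-off the image impedance is reactive and the transfer constant has acquired a real part.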

The  image  parameters  thus  defined  may  be  looked  upon  as  an  exact 
formulation  of  the  general  conception,  suggested  in  the  introduction, 
that  the  analysis  of  networks  of  lumped  elements  can  frequently  be 
made  most  conveniently  by  means  of  the  terminology  and  methods 
developed  for  continuous  media.  From  this  point  of  view,  the  image 
impedance  and  transfer  constant  are  simply  new  names  for  the  char¬ 
acteristic  impedance  and  propagation  constant  of  a  continuous  medium 
whose  external  performance  at  the  frequency  in  question  is  identical 
for  all  boundary  conditions  with  that  of  the  network.  If  the  network  is 
symmetrical,  the  identification  of  the  image  impedance  with  the  char¬ 
acteristic  impedance  and  of  the  transfer  constant  with  the  propagation 
constant  is  perfect.  The  necessity  of  using  two  image  impedances  in 
the  general  case  arises  simply  from  the  fact  that  since  a  general  network 
will  be  unsymmetrical  its  “characteristic  impedances”  in  the  two 
directions  will  be  different. 

Granted  this  analogy,  we  may  look  upon  the  image  impedance  and 
transfer  constant  as  furnishing  rough  measures  of  the  actual  impedance 
and  transmission  characteristics  of  the  filter  under  normal  conditions  in 
exactly  the  same  way  that  the  characteristic  impedance  and  propagation 
constant  of  a  continuous  medium  furnish  rough  measures  of  the  char¬ 
acteristics  to  be  expected  from  it.  In  both  cases  the  measures  will  be 
exact  if  the  terminating  impedances  happen  to  be  equal  to  the  char¬ 
acteristic  or  image  impedances  of  the  structure.  Otherwise,  they 
must  be  corrected,  to  a  greater  or  less  extent,  to  take  account  of  “reflec¬ 
tion  effects”  at  the  boundaries.  In  practical  filter  design  these  correc¬ 
tions  must,  of  course,  be  taken  into  consideration.  Since  they  are 
usually  relatively  small,  however,  and  since  the  subject  is  at  any  rate  a 

* Equations (8) are also given by K. S. Johnson, loc. cit. p. 84, where they are
proved,  very  simply,  upon  the  assumption  that  the  network  can  be  replaced  by  an 
equivalent  T. 




well understood part of transmission theory* they will not be considered
further  in  this  paper. 

Definition  of  a  Filter 

The  preceding  analysis  makes  no  distinction  between  filters  and 
general  four-terminal  reactive  networks.  A  simple  definition  of  a  filter, 
as a structure in which currents lying within specified continuous frequency
ranges are transmitted freely while currents of all other frequencies
are suppressed, was given in the introduction. This conception is
clear  enough  in  intent  but  it  is  scarcely  suitable  as  a  basis  for  exact 
analysis.  One  difficulty,  suggested  by  our  discussion  of  the  preceding 
paragraph,  is  the  fact  that  the  transmission  characteristics  exhibited 
by  a  network  under  actual  operating  conditions  depend  both  upon  the 
network  and  upon  its  relations  to  its  terminating  impedances.  There 
is  no  inherent  filtering  property  residing  in  the  network  alone.  A 
second  difficulty  arises  if  we  observe  that  when  the  terminating  im¬ 
pedances  are  any  arrangements  of  ordinary  lumped  electrical  elements 
the  current  delivered  to  the  load  impedance  must  be  a  rational  function 
of  frequency.  It  follows  from  general  function  theoretic  considerations, 
consequently,  that  the  transmitted  energy  cannot  be  exactly  constant 
over  any  continuous  range  of  frequencies  unless  it  is  constant  in  all 
portions  of  the  frequency  spectrum.  Evidently,  therefore,  the  char¬ 
acteristics  we  can  actually  obtain  in  ordinary  circuits  can  be  only 
approximations  to  the  ideal  conception,  and  we  are  left  with  the  difficult 
question  of  trying  to  specify  what  constitutes  an  adequate  approxima¬ 
tion  in  all  the  variety  of  engineering  situations  to  which  filters  may  be 
applied. 

In  these  circumstances,  a  definition  of  filters  which  will  form  a  suitable 
basis  for  exact  analysis  must  necessarily  be  somewhat  artificial.  At  best 
it  must  represent  an  idealization  whose  relevance  to  the  characteristics 
obtainable  in  actual  circuits  is  a  matter  for  engineering  rather  than 
mathematical  judgment  to  determine.  The  definition  adopted  in  this 
paper  is  based  upon  the  image  parameters.  Since  previous  formal  filter 
analysis  has  also  been  based  upon  these  parameters,  this  choice  is  recom¬ 
mended  by  reasons  both  of  convenience  and  familiarity.  Primarily 
however,  its  validity  must  depend  upon  the  extent  to  which  the  image 

* See for example K. S. Johnson, loc. cit., or T. E. Shea “Transmission Networks
and Wave Filters” Chap. IV.




impedance  and  transfer  constant  can  be  taken  as  approximate  measures 
of  the  actual  impedance  and  transmission  characteristics  of  the  structure, 
in  the  manner  described  in  a  previous  section,  when  reasonable  care  is 
used  in  adjusting  the  network  to  its  terminations,  and  can  be  decided 
only  by  engineering  experience.  Explicitly,  we  shall  define  a  filter  to  be  a 
four-terminal  network  of  ordinary  lumped  elements  which,  as  judged  by  its 
transfer  constant  alone,  transmits  one  continuous  band  of  real  frequencies 
and  attenuates  all  other  real  frequencies.  Within  the  transmitted  band, 
then, the transfer constant must be a pure imaginary,* which signifies
phase shift with zero attenuation, while at other frequencies it must
have  a  positive  real  component. 

The  restriction  to  networks  transmitting  single  bands  may  demand  a 
word  of  explanation.  It  is  advanced  in  part  as  a  matter  of  convenience. 
Although  multiple  band-pass  filters  have  been  described  frequently  in  the 
past,  only  the  single  band-pass  types  appear  to  be  of  much  practical 
importance,  and  by  restricting  the  term  “filter”  explicitly  to  them 
several  advantages  in  simplicity  and  precision  of  statement  are  gained. 
There  is  also  a  further  reason.  It  will  appear  from  later  discussion  that 
the  most  general  four-terminal  reactive  network  is,  as  judged  from  its 
transfer constant, a structure containing a number of narrow transmitting
bands alternating with regions of suppression. In practice,
however,  such  narrow  transmitting  bands  are  much  obscured  by  reflec¬ 
tion  effects  and  parasitic  dissipation  of  energy  in  the  network  elements. 
Unless  the  restriction  to  single  bands  is  made,  therefore,  it  is  difficult 
either  to  draw  any  clear-cut  distinction  between  filters  and  general  net¬ 
works  or  to  have  any  assurance  that  a  structure  which  the  analysis 
describes  as  a  filter  will  have  a  reasonably  well-defined  characteristic  in 
practice. 

The types of filters transmitting a single continuous band are, of
course, low-pass, high-pass, band-pass, and (as a limiting case) all-pass.
In the second half of the paper appropriate configurations for all of
these various types will be discussed. For the sake of simplicity,
however, the intervening analysis will consider primarily only band-pass
filters. The analysis can, of course, be extended to filters of other types
by assigning extreme values to the cut-offs.

**  It  may  be  noticed  from  equations  (8)  that  this  restriction  cannot  be  met  by  a 
network  containing  dissipative  elements,  which  justifies  our  original  restriction 
to  networks  of  pure  reactances. 




PART II. PROPERTIES OF PHYSICALLY REALIZABLE SYSTEMS**


The general circuit equations we have thus far considered hold
whether  the  network  contains  positive  or  negative  elements.  Since  the 
paper  will  be  primarily  concerned  with  physically  realizable  systems,  we 
must  now  turn  our  attention  to  the  additional  conditions  which  must  be 
met  if  the  equations  are  to  represent  ordinary  physical  networks. 

The conditions we shall use are borrowed largely from classical
dynamics. The connection between the previous circuit analysis and
general dynamics will be apparent from the observation that equations (1),
which are usually derived directly from Kirchhoff’s laws, can equally
well be looked upon as the ordinary dynamical solution of a vibrating
system.

From this point of view, the system of equations is simply an
integrated form of the differential expressions resulting from the
application of Lagrange’s equations to the energy functions**

$$T = \frac{1}{2}\sum_{i=1}^{k}\sum_{j=1}^{k} L_{ij}\,\dot{q}_i\,\dot{q}_j, \qquad V = \frac{1}{2}\sum_{i=1}^{k}\sum_{j=1}^{k} \frac{1}{C_{ij}}\,q_i\,q_j \tag{9}$$

where $q_1 \ldots q_k$ are the electrical displacements around the $k$ meshes
of the network and $\dot{q}_1 \ldots \dot{q}_k$ are their time derivatives. The
displacements $q_1 \ldots q_k$, of course, serve also as the generalized
coordinates of the Lagrangian solution.


Positive Definiteness of the Energy Functions

The above interpretation of our earlier equations immediately leads
to one condition which must be satisfied by a physically realizable
solution. Evidently, if the network contains only positive elements, the
energies associated with its inductances and capacities must be positive
for any physically possible distribution of currents in it. The quadratic
forms of equation (9) must therefore be positive definite. Since the
coefficients of these forms are also the coefficients in (1) this gives us a
necessary condition both on the mesh equation solution and on the
formulae for the image parameters derived therefrom in equations
(5), (7) and (8).

** In connection with this topic, the early work by Cauer, although it deals
chiefly with two-terminal rather than four-terminal networks, should be of
interest. The reader is referred particularly to articles in Archiv für
Elektrotechnik, Dec. 1926; Sitz. ber. Preuss. Ak. der Wiss., 1927; and Sitz.
ber. Berliner Math. Ges., 1928. Mention should also be made of the conditions
advanced for the general four-terminal network by Gewertz (this Journal,
Jan. 1933). The latter reference also gives a general construction method for
four-terminal networks.

** These relations, if they are not evident from inspection, can readily be
established by a direct computation of the energies associated with each
branch of the network in terms of the mesh currents flowing in that branch.
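The positive definiteness requirement can be checked numerically for any
proposed set of mesh coefficients. The following is a minimal sketch, not a
procedure taken from the paper: it attempts a Cholesky factorization, which
succeeds exactly when the quadratic form is positive definite. The two-mesh
coefficient values and the names `is_positive_definite`, `L`, `D` and
`L_negative_element` are hypothetical and chosen only for illustration;
`D[i][j]` stands for $1/C_{ij}$.

```python
import math

def is_positive_definite(m, tol=1e-12):
    """Return True if the symmetric matrix m (list of lists) is positive definite.

    An attempted Cholesky factorization succeeds exactly when every
    pivot is strictly positive.
    """
    n = len(m)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(chol[i][t] * chol[j][t] for t in range(j))
            if i == j:
                if s <= tol:
                    return False        # non-positive pivot: form is not definite
                chol[i][j] = math.sqrt(s)
            else:
                chol[i][j] = s / chol[j][j]
    return True

# Hypothetical two-mesh network with a shared (mutual) branch.
L = [[2.0, -1.0], [-1.0, 3.0]]                     # inductance coefficients L_ij
D = [[4.0, -1.0], [-1.0, 2.0]]                     # elastance coefficients 1/C_ij
L_negative_element = [[2.0, -3.0], [-3.0, 3.0]]    # implies a negative element

print(is_positive_definite(L), is_positive_definite(D),
      is_positive_definite(L_negative_element))
```

The third matrix fails because its mutual term exceeds what the two mesh
self-inductances allow, which is only possible if some branch inductance is
negative.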

The condition of positive definiteness is not only necessary but also
sufficient for the analysis of filters according to the definition given
previously. The proof of this statement, however, will require a lengthy
analysis. We shall begin by deriving from this necessary condition a
succession of others more immediately interpretable in terms of filters
and their characteristics. This will consume most of the rest of the
first half of the paper. It leads finally to the conclusion that when the
condition of positive definiteness is satisfied, the possible variety of filter
characteristics is narrowed down to a comparatively small range and
that any of the characteristics within this range must be obtainable as
combinations of a limited number of elementary characteristics. The
sufficiency of the condition is then established in the second half of the
paper, where it is shown by direct demonstration that a physical
configuration corresponding to each of these elementary characteristics
can be found. As a first step in this process we will consider the

Properties of the Determinants Associated with Physically Realizable Networks

Since the expressions for the image parameters given by equations (5)
and (7) involve the network determinant $\Delta$ and the various minors
$\Delta_{11}$, $\Delta_{22}$, and $\Delta_{1122}$, the positive definiteness
condition must amount to a set of restrictions upon these quantities. It is
clear that all of the determinants are simply polynomials in
$\lambda$ ($= p^2$). They can therefore be specified, except for constant
multipliers, by their zeros and our problem reduces to that of determining
the restrictions on the zeros which follow from the condition of positive
definiteness.

The consequences of the positive definiteness condition are familiar
both in ordinary dynamics and in the algebra of matrices.** In ordinary
dynamics, for example, the roots of the equation $\Delta = 0$ are simply the
normal coordinates of the structure. Similarly, the roots of the equation
$\Delta_{11} = 0$ represent the normal coordinate solution obtained when a
constraint which prevents displacements around the first mesh is applied
to the network; the roots of $\Delta_{22} = 0$ represent the normal coordinate
solution when the constraint is applied to the second mesh, and the roots
of $\Delta_{1122} = 0$ a similar solution when the constraint is applied to both
meshes. The restrictions on the roots of these polynomials can, therefore,
be obtained from the usual normal coordinate solutions for free and
constrained vibrating systems having positive definite energy functions.
The algebra of matrices can be applied to the problem if we observe that
since each coefficient in equation (1) is of the form $1/C_{ij} + \lambda L_{ij}$
the complete system of coefficients in (1) represents a “$\lambda$-matrix”
$\lVert a_{ij} + \lambda b_{ij} \rVert$. We can consequently make use of the
algebraic theory of elementary divisors developed for such matrices in our
problem.

** See, for example, Whittaker’s “Analytical Dynamics” and Bôcher’s “Higher
Algebra.”


Fig. 13. Normal coordinate relations in a general network (horizontal axis: frequency)


Since both of these fields are familiar, they need not be reviewed in
detail. Upon combining the results obtained from them with the
results obtained by direct inspection, we readily find that the
determinants have the following properties:

1. It is evident from inspection that the determinants $\Delta$, $\Delta_{11}$,
$\Delta_{22}$, $\Delta_{1122}$ are polynomials in $\lambda$ of respectively the
$k$-th, $(k-1)$st, $(k-1)$st and $(k-2)$nd degrees with real coefficients.

2. The coefficient of the highest power of $\lambda$ in each polynomial is a
determinant involving the L’s alone and the coefficient of the lowest
power is a determinant involving the C’s alone. If the energy functions
of the network are positive definite forms, both of these determinants are
necessarily positive. With the help of (3), below, then, we can conclude
that the coefficients of the polynomials are not only real but positive.




3. The roots of the polynomials occur only at negative real values of
$\lambda$. Since $\lambda = -4\pi^2 f^2$, each such root corresponds to a pair of
roots of positive and negative real frequencies.

4. Since the normal coordinates of a constrained system fall between
those of the unconstrained system, the $k$ roots of $\Delta = 0$ are separated
by the $k - 1$ roots of $\Delta_{11} = 0$ and also by the $k - 1$ roots of
$\Delta_{22} = 0$. These last two sets of roots are in turn separated by the
$k - 2$ roots of $\Delta_{1122} = 0$. The arrangement is illustrated by Fig. 13,
the roots being indicated by circles on the various horizontal lines.

Figs. 14-16. Special normal coordinate relations possible in physical networks
(each panel shows the roots of $\Delta$, $\Delta_{11}$, $\Delta_{22}$ and
$\Delta_{1122}$ against frequency)

5. The case envisaged in the preceding paragraph is the normal one
in which all of the roots are simple. In special cases multiple roots may
occur. For example, it is clear that if all of the branch impedances of
the network vanish at the same frequency, the determinant $\Delta$ will have
a root of the $k$-th order at that frequency. Multiple roots may be
treated by the method of elementary divisors. They may also be looked
upon, perhaps more simply, as a superposition of the simple roots
illustrated by Fig. 13. Either method allows us to conclude that if any
determinant of the set $\Delta$, $\Delta_{11}$, $\Delta_{22}$, $\Delta_{1122}$ has a
root of the $n$th order at any frequency, the determinants of the next higher
and lower orders must have roots whose orders do not differ from $n$ by more
than one. Several possible arrangements obeying this rule are shown by
Figs. 14, 15 and 16, where the multiplicity of the roots is indicated by the
superposition of the circles representing simple roots. Only a portion of
each system of roots is shown. The diagrams are typical of arrangements
which will play an important role in our later discussion.

6. It is obvious from paragraph (2) that all of the determinants must
be positive real quantities at all points on the positive real $\lambda$-axis.

7. The last restriction is expressed essentially by the identity given in
equation (2). The determinant $\Delta_{12}$ on the right-hand side of this
equation is not one to which the preceding restrictions apply. It is
evident from inspection, however, that it will at least be a polynomial in
$\lambda$ with real coefficients. Since $\Delta_{12}$ appears as a square in
equation (2), we can conclude that the quantity
$\Delta_{11}\Delta_{22} - \Delta\Delta_{1122}$ will be positive
at real values of $\lambda$ and that its zeros will be of even order and will be
either real or conjugate complex quantities.
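Several of the properties listed above can be illustrated on the smallest
non-trivial case. The sketch below uses a hypothetical two-mesh network (the
coefficient values are ours, not the paper's) and checks properties 2, 3, 4, 6
and 7. Equation (2) is not reproduced in this part of the paper, so we assume,
on the strength of property 7, that it has the form
$\Delta_{11}\Delta_{22} - \Delta\Delta_{1122} = \Delta_{12}^2$. For two meshes
this identity is algebraically immediate, so the last check is a consistency
test rather than a proof.

```python
import math

# Hypothetical two-mesh reactance network (values chosen for illustration).
L = [[2.0, -1.0], [-1.0, 3.0]]      # inductance coefficients L_ij
D = [[4.0, -1.0], [-1.0, 2.0]]      # elastance coefficients 1/C_ij

def coeff(lam):
    """Coefficient matrix of the mesh equations, 1/C_ij + lam * L_ij."""
    return [[D[i][j] + lam * L[i][j] for j in range(2)] for i in range(2)]

def determinants(lam):
    """Return (Delta, Delta_11, Delta_22, Delta_1122, Delta_12) at lam."""
    c = coeff(lam)
    delta = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    # Striking one row and column of a 2x2 array leaves a single entry;
    # striking both rows and columns leaves the empty determinant, 1.
    return delta, c[1][1], c[0][0], 1.0, c[1][0]

# Delta = a*lam^2 + b*lam + c0.  By property 2 the end coefficients are the
# determinants of the L's alone and of the 1/C's alone.
a = L[0][0] * L[1][1] - L[0][1] ** 2
c0 = D[0][0] * D[1][1] - D[0][1] ** 2
b = L[0][0] * D[1][1] + L[1][1] * D[0][0] - 2.0 * L[0][1] * D[0][1]

disc = math.sqrt(b * b - 4.0 * a * c0)
roots_delta = sorted([(-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)])
root_delta11 = -D[1][1] / L[1][1]   # single root of Delta_11
root_delta22 = -D[0][0] / L[0][0]   # single root of Delta_22

def identity_residual(lam):
    """Delta_11*Delta_22 - Delta*Delta_1122 - Delta_12^2 (assumed form of (2))."""
    d, d11, d22, d1122, d12 = determinants(lam)
    return d11 * d22 - d * d1122 - d12 ** 2

print("roots of Delta:", roots_delta)
print("root of Delta_11:", root_delta11, " root of Delta_22:", root_delta22)
```

Both roots of $\Delta$ come out negative and real, and the single roots of
$\Delta_{11}$ and $\Delta_{22}$ each fall between them, as property 4 requires.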

Consequences  of  the  Alternation  of  Zeros  and  Poles — Foster's  Theorem 

Conditions  (6)  and  (7)  in  the  above  list  will  not  be  used  until  a  later 
point  in  our  analysis.  The  remaining  conditions,  however,  can  be  used 
immediately  to  reduce  the  general  expressions  given  by  equations  (5) 
and (7) to more manageable forms. In all of these expressions, the
determinants $\Delta$, $\Delta_{11}$, etc. appear only as ratios. All of the
ratios, moreover, are between determinants of adjacent orders. So far as any
individual  ratio  is  concerned,  therefore,  the  possible  existence  of  the 
multiple  zeros  considered  in  paragraph  (5)  above  can  be  ignored  and  we 
can  address  ourselves  to  the  problem  as  though  we  had  to  consider  only 
the  arrangement  of  simple  zeros  and  poles  shown  by  Fig.  13. 

If to these considerations we add the requirement that zeros of
constrained and unconstrained systems must separate one another our
expressions can be written in a definite functional form. Any particular
ratio, such as $\Delta/\Delta_{11}$ for example, must appear as

$$\frac{\Delta}{\Delta_{11}} = k\,\frac{(\lambda+\lambda_1)(\lambda+\lambda_3)\cdots(\lambda+\lambda_n)}{(\lambda+\lambda_2)(\lambda+\lambda_4)\cdots(\lambda+\lambda_{n-1})} \tag{10}$$

where $k$ is a positive real constant and
$0 \le \lambda_1 < \lambda_2 < \lambda_3 \cdots < \lambda_n$. The
expression contains one more factor in the numerator than in the
denominator because $\Delta$ is one degree higher than $\Delta_{11}$. Since we
do not know how many multiple roots may have disappeared in taking the
ratio, however, it is impossible to tell what relation the number $n$ bears
to the original order of the equations. The expression may be taken as a
general one except for the fairly obvious limitation that the lowest and
highest zero, which have been represented as occurring at the finite
points $-\lambda_1$ and $-\lambda_n$, may in extreme cases assume the limiting
values zero and infinity. These extreme cases are, however, of particular
importance since they are of frequent occurrence in network practice.

Since such expressions as (10) will be much in evidence in our future
analysis, it is desirable to write them more compactly. We shall adopt
the notation illustrated by

$$\frac{\Delta}{\Delta_{11}} = k\,\frac{a_1 a_3 \cdots a_n}{a_2 a_4 \cdots a_{n-1}} \tag{11}$$

where each $a$ replaces one of the factors in (10). The $a$’s will be
numbered in the order in which the zeros and poles they represent occur
along the axis of real $\lambda$’s. When it is necessary to distinguish between
two or more ratios of determinants we may use $b$’s and $c$’s as well as $a$’s.
It will also be convenient to use such expressions as “the zero $a_1$” and
“when $f$ (or $\lambda$) $= a_1$” or simply “at $a_1$” to mean either the
frequency or the $\lambda$ at which the zero represented by $a_1$ occurs. In
general, the $a$’s will be called “critical frequency factors” and the zeros
and poles which they represent, “critical” or “natural frequencies.”**

For many of our future purposes it will be convenient to express this
result in a slightly different form. It is easy to see from the argument
upon which equation (8) was based that the driving-point impedance of a
general reactance network can be expressed as
$Z = \frac{1}{p}\,\frac{\Delta}{\Delta_{11}}$. The general
expression for the impedance of a two-terminal reactance network can,
therefore, be written as

$$Z = \frac{k}{p}\,\frac{a_1 a_3 \cdots a_n}{a_2 a_4 \cdots a_{n-1}} \tag{12}$$

where $k$ and $a_1 \ldots a_n$ have the significance previously described.

** It is customary to consider only the zeros of $\Delta$ as the “natural
frequencies” since they represent the oscillations which the structure can
sustain in the absence of a driving force. By extending the term to all of the
zeros and poles which may be found in the various ratios of determinants
involved in equations (5) and (7), we have, therefore, also included zeros of
$\Delta_{11}$, $\Delta_{22}$ and $\Delta_{1122}$, which represent “natural
frequencies” not of the original network but of the network when suitable
constraints are imposed at one or both ends.

This description of the impedance characteristics of two-terminal
reactance networks was first given (except for differences in notation) by
R. M. Foster,** who obtained it from classical dynamics in a fashion
somewhat similar to the foregoing. In the construction of reactive
networks the most convenient configurations are two canonical forms
first described and used by Zobel.** They are respectively the set of
anti-resonant circuits in series shown by Fig. 17 and the set of resonant
circuits in parallel shown by Fig. 18. Foster also showed that when the
requirements we have laid down for (12) are satisfied a physical network
in either configuration can be obtained by a partial fraction expansion
of either the impedance function itself or its reciprocal.

Fig. 17. Canonical form for two-terminal reactive networks

Fig. 18. Canonical form for two-terminal reactive networks

** “A Reactance Theorem,” Bell System Technical Journal, Apr. 1924. Foster
is responsible for the definitive formulation given in this paper but, as he himself
remarks, much of the content of the theorem had already been published by both
Campbell and Zobel. See papers previously cited.

** Loc. cit.
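The partial fraction expansion mentioned above can be sketched numerically
for the series form of Fig. 17. The code below is an illustrative sketch under
our own assumptions, not Foster's or Zobel's published procedure: it takes the
product form (12) with $a_i = p^2 + \lambda_i$, assumes $\lambda_1 > 0$ and an
odd number of factors, and returns a series inductance, a series capacitance
and one anti-resonant (parallel L-C) section per pole. The function names and
the numerical values are hypothetical.

```python
def foster_series_form(k, lam):
    """Expand Z = (k/p) * a1*a3*...*an / (a2*a4*...*a_{n-1}), a_i = p**2 + lam[i-1],
    as  Z = L_inf*p + 1/(C0*p) + sum_j (p/C_j) / (p**2 + 1/(L_j*C_j)).

    lam holds lam_1 < lam_2 < ... < lam_n (n odd, lam_1 > 0).
    Returns (L_inf, C0, [(L_j, C_j), ...]).
    """
    zeros = lam[0::2]          # lam_1, lam_3, ... (numerator factors)
    poles = lam[1::2]          # lam_2, lam_4, ... (denominator factors)
    a0 = k                     # coefficient of 1/p: value of p*Z at lam = 0
    for z in zeros:
        a0 *= z
    for q in poles:
        a0 /= q
    sections = []
    for q in poles:
        # Residue of (p*Z)/lam at lam = -q; positive when the lam_i alternate.
        res = k / (-q)
        for z in zeros:
            res *= (z - q)
        for other in poles:
            if other != q:
                res /= (other - q)
        sections.append((res / q, 1.0 / res))   # (L_j, C_j) of the parallel pair
    return k, 1.0 / a0, sections

def z_product(k, lam, p):
    """Impedance evaluated directly from the product form (12)."""
    value = k / p
    for i, l in enumerate(lam):
        factor = p * p + l
        value = value * factor if i % 2 == 0 else value / factor
    return value

def z_network(elements, p):
    """Impedance of the series-connected elements returned above."""
    l_inf, c0, sections = elements
    total = l_inf * p + 1.0 / (c0 * p)
    for l_j, c_j in sections:
        total += (p / c_j) / (p * p + 1.0 / (l_j * c_j))
    return total

elements = foster_series_form(1.0, [1.0, 2.0, 4.0])
print(elements)
```

Because the critical values alternate, every residue is positive and all the
element values returned are positive, which is the sense in which the
alternation of zeros and poles guarantees a physical network.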

Since the $Z_{sc}$ and $Z_{oc}$ of equations (8) are simply two-terminal
reactances, Foster’s formulation can be used in conjunction with these
equations as an alternative to the general formulation for the image
parameters in terms of $\Delta$ and its minors. A more important application
of Foster’s results, however, will appear when we come to the problem of
constructing physical networks to represent filters of general types,
since we will be able to consider the problem solved as soon as we can
show that each branch of the structure satisfies Foster’s restrictions.

Extension  of  the  Analysis  to  Networks  of  Other  Types 

The discussion thus far has been restricted to networks of pure
reactances and will continue to be so restricted, at least where all explicit
statements are concerned, in the future. It is of some interest, however,
to observe that many of the results can be extended to networks of
other types. For example, if the network is one of inductances and
resistances, it will be characterized by a stored energy function $T$ and a
dissipation function $F$, both of which must be positive definite quadratic
forms if the structure is to be physically realizable. All of our analysis
of such forms will therefore apply, at least formally, to this structure.

This generalization can be made somewhat more concrete if we observe
that the reactive structure can be considered as a network which has
been made up of multiples of two basic impedances, $Z_1$ and $Z_2$, where $Z_1$
represents the impedance of a unit inductance and $Z_2$ the impedance of a
unit capacity. In the general system of mesh equations given by (1),
the variable $\lambda$ on the left-hand side evidently represents the ratio
$Z_1/Z_2$ while $p$, on the right side, represents $1/Z_2$. It is easy to see
that the expressions will not be essentially changed if for the unit coil and
unit condenser we introduce any other two types of impedance elements.
We can, therefore, modify equation (12) to meet these more general
conditions simply by replacing $\lambda$ and $p$ by their values in terms of the
new impedance elements.

As an example of this process we may take a reactive network whose
coils and condensers are slightly dissipative. If we let $d_1$ and $d_2$ be
the dissipation constants of the two types of elements the impedance
of a typical coil can be written as $(1 - id_1)\,i\omega L$ and that of a typical
condenser as $1/[(1 - id_2)\,i\omega C]$. Equation (12) will still be a valid
expression for the impedance of the network provided we let $p$ represent
$(1 - id_2)\,i\omega$ and let the $\lambda$ which occurs in each of the $a$’s
represent $-\omega^2(1 - id_1)(1 - id_2)$. This particular extension of
Foster’s theorem is especially valuable since with its help we can easily
extend all of the formulae we will later develop for ideal reactive networks
to include the effects of the small parasitic resistances which are
unavoidable in practice.
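The substitution just described can be verified on the simplest possible
case, a single coil in series with a single condenser, for which (12) reduces
to $Z = (k/p)\,a_1$ with $k = L$ and $a_1 = \lambda + 1/(LC)$. The sketch below
compares the impedance obtained by adding the two dissipative elements
directly with the value given by (12) after replacing $p$ and $\lambda$. The
element values, the dissipation constants and the function names are
hypothetical.

```python
def z_series_lc_direct(L, C, omega, d1, d2):
    """Sum of one dissipative coil and one dissipative condenser in series."""
    coil = (1 - 1j * d1) * 1j * omega * L
    condenser = 1.0 / ((1 - 1j * d2) * 1j * omega * C)
    return coil + condenser

def z_series_lc_from_12(L, C, omega, d1, d2):
    """Same impedance from Z = (k/p) * a1 with the substituted p and lambda."""
    p = (1 - 1j * d2) * 1j * omega
    lam = -omega ** 2 * (1 - 1j * d1) * (1 - 1j * d2)
    k, lam1 = L, 1.0 / (L * C)      # lossless case: Z = pL + 1/(pC) = (L/p)(lam + 1/(LC))
    return (k / p) * (lam + lam1)

# Hypothetical element values: 2 mH, 1 microfarad, omega = 2e4 rad/s.
z_direct = z_series_lc_direct(2e-3, 1e-6, 2e4, 0.01, 0.001)
z_formula = z_series_lc_from_12(2e-3, 1e-6, 2e4, 0.01, 0.001)
print(z_direct, z_formula)
```

The two values agree to rounding error, and the real part is positive, as it
must be for a network containing small positive resistances.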

PART III. GENERAL DESCRIPTION OF THE EXTERNAL CHARACTERISTICS OF FILTERS

As we have previously stated, formulae of the type given by (10) do
not represent a complete statement of the conditions on physically
realizable networks because they neglect points 6 and 7 on the preceding
list. Even without the help of these last two conditions, however, we
will be able to make considerable progress in the analysis of filters. In
this portion of the paper we shall attempt to obtain a general description
of the external characteristics of physically realizable filters and a
statement of the essential distinction between filters and general
four-terminal reactive networks. The method we will use is somewhat
similar to that by which the characteristics of the lattice or Wheatstone
bridge configuration are expressed in terms of the impedances of its
series and lattice branches.** It is reviewed here partly for the sake of
the greater generality of the present discussion and partly to establish
the terms which will be employed in later analysis.


Coincidence Conditions for Filters

For the sake of simplicity, we shall neglect for the moment the
conditions at the second end of the structure. It follows from equations (5)
and (7) then, that the transfer constant and the image impedance at the
first end depend only upon the ratios $\Delta/\Delta_{11}$ and
$\Delta_{22}/\Delta_{1122}$, which in the notation adopted in the previous
section can be written as

$$\frac{\Delta}{\Delta_{11}} = k_1\,\frac{a_1 \cdots a_n}{a_2 \cdots a_{n-1}} \tag{13}$$

and

$$\frac{\Delta_{22}}{\Delta_{1122}} = k_2\,\frac{b_1 \cdots b_m}{b_2 \cdots b_{m-1}} \tag{14}$$

The general expressions for the transfer constant and the image
impedance consequently become

$$\tanh^2\theta = \frac{k_1}{k_2}\,\frac{a_1 \cdots a_n\; b_2 \cdots b_{m-1}}{a_2 \cdots a_{n-1}\; b_1 \cdots b_m} \tag{15}$$

and

$$Z_I^2 = \frac{k_1 k_2}{p^2}\,\frac{a_1 \cdots a_n\; b_1 \cdots b_m}{a_2 \cdots a_{n-1}\; b_2 \cdots b_{m-1}} \tag{16}$$

where $Z_I$ has been written for brevity in place of $Z_{I_1}$.

** Cf. Cauer or H. W. Bode, loc. cit., or see later discussion under the heading,
“Symmetrical Structures.”

We know from our previous discussion that the network transmits if
$\tanh^2\theta < 0$ and attenuates if $\tanh^2\theta > 0$. In general, of course,
the sign of (15) will change at each of the $a$’s and each of the $b$’s unless
an $a$ and a $b$ happen to be equal. If we disregarded for the moment our
previous definition of a filter, therefore, we might say that the most general
four-terminal reactive network is a multiple band-pass filter. Since,
however, we have agreed to consider as filters only structures transmitting
single bands, for reasons given previously, we must turn to the special
conditions which the network must satisfy if it is to meet this
requirement. They are evident from inspection. Clearly, the structure must
be so arranged that all of the $a$’s are equal to, or “coincide with” $b$’s
except in two cases. The two exceptional cases, of course, represent
cut-offs, or frequencies at which the network changes from a condition
of transmission to one of attenuation or vice-versa. For our present
purposes, it is a matter of indifference whether they are both $a$’s, both
$b$’s, or one $a$ and one $b$.
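The sign argument above lends itself to a short computation. The sketch
below is our own illustration, not part of the paper's method: it counts the
net power with which each critical value enters the expression for
$\tanh^2\theta$ (odd $a$’s and even $b$’s in the numerator, even $a$’s and odd
$b$’s in the denominator), keeps the values of odd net power, and pairs them
off into transmission bands, starting from attenuation at $\lambda = 0$.
Positions are given as $x = -\lambda$, that is, as squared angular
frequencies. The function name and the numerical critical values are
hypothetical and are not checked for physical realizability.

```python
from collections import defaultdict

def transmission_bands(a_crit, b_crit):
    """Return the intervals of x = -lambda on which tanh^2(theta) < 0.

    a_crit and b_crit list the critical values lambda_i of the a's and b's
    in increasing order (a_i = lambda + a_crit[i-1], and likewise for b).
    """
    exponent = defaultdict(int)
    for i, x in enumerate(a_crit, start=1):
        exponent[x] += 1 if i % 2 else -1      # odd a's: numerator of (15)
    for i, x in enumerate(b_crit, start=1):
        exponent[x] += -1 if i % 2 else 1      # odd b's: denominator of (15)
    # The sign of tanh^2(theta) changes only where the net power is odd.
    changes = sorted(x for x, e in exponent.items() if e % 2)
    bands = []
    for lo in range(0, len(changes), 2):
        hi = changes[lo + 1] if lo + 1 < len(changes) else float("inf")
        bands.append((changes[lo], hi))
    return bands

# One coincidence only (a_2 = b_2): two transmitting bands remain.
print(transmission_bands([1, 4, 9], [2, 4, 6]))
# All factors paired except a_3 and b_3: a single band between the cut-offs.
print(transmission_bands([1, 4, 9], [1, 4, 16]))
```

In the second call every critical value but two is paired, and the structure
behaves as a filter in the restricted sense adopted here, with the two
unpaired factors acting as cut-offs.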

Our discussion of the properties of the roots of the determinants
suggests that these necessary coincidences or equalities between the $a$’s
and the $b$’s may occur naturally in two general ways. The first way
gives rise to coincidences between odd $a$’s and even $b$’s or vice-versa and
since only two of the $a$’s and $b$’s can remain unpaired, it can be
represented by the formula $a_j = b_{j\pm1}$. We may take as an example the
coincidence of an odd $a$ with an even $b$. Evidently these two must be
respectively roots of the equations $\Delta = 0$ and $\Delta_{1122} = 0$. As our
discussion of Fig. 13 shows, one root each of $\Delta$ and $\Delta_{1122}$ must
fall between successive roots of $\Delta_{11}$ and $\Delta_{22}$. The roots of
$\Delta$ and $\Delta_{1122}$ may therefore be paired off, each pair lying within
one of the intervals between successive roots of $\Delta_{11}$ or $\Delta_{22}$.
The two members of any one pair, since they fall within the same restricted
range, must, of course, be approximately equal. The coincidence condition in
this case, therefore, merely demands that the equality be exact. Similarly,
the coincidence of even $a$’s with odd $b$’s demands the equality of roots of
the equations $\Delta_{11} = 0$ and $\Delta_{22} = 0$. Since the roots of
$\Delta_{11}$ and $\Delta_{22}$ can also be paired off into couples, each lying
within one interval between successive roots of $\Delta$ and $\Delta_{1122}$,
the same condition evidently applies.




Figure 13, it will be remembered, was drawn on the assumption that
the determinants had only simple roots. Multiple roots, on the other
hand, give rise to coincidences between even $a$’s and even $b$’s or odd $a$’s
and odd $b$’s. They may be represented by the general formulae $a_j = b_j$
or $a_j = b_{j\pm2}$. For example, it is clear that a multiple root arrangement
of the sort shown by Fig. 14 will give rise to a coincidence between an
odd $a$ and an odd $b$, after common factors have been cancelled out in the
ratios of the determinants. On the other hand, multiple root arrangements
of the sorts shown by Fig. 15 and Fig. 16 will give rise to coincidences
between even $a$’s and even $b$’s.

Coincidences between even $a$’s and even $b$’s or between odd $a$’s and
odd $b$’s can also be obtained in special circumstances even when the
determinants have only simple roots. Such a coincidence might be found,
for example, if a simple root of $\Delta$ were to coincide with a simple root of
$\Delta_{22}$. Since the normal coordinates of the constrained system
represented by $\Delta_{22}$ separate those of the unconstrained system
represented by $\Delta$, in general, this evidently represents an extreme case.
It will be encountered, however, if the displacement around the terminals to
which the constraint is applied can be represented as a linear combination of
less than the total number of normal coordinates of the unconstrained
system.** This distinction between the two methods by which coincidences
of the second type can be produced is of some importance,
since it corresponds to a distinction between factors which are found in
the image impedance at both ends of a network and factors found in the
image impedance at one end only. It will be discussed in more detail
in a later section.

Impedance and Transfer Constant Controlling Frequencies

The distinction between the two types of coincidences, which has been
established from a consideration of the ways in which they may originate
physically, is also important from other points of view. For example,
it corresponds to a distinction between critical frequencies which are
found in the transmission and attenuation regions of the filter. This is
easily seen if we observe that at $\lambda = 0$ all $a$’s and $b$’s are positive
real quantities. The origin must therefore fall in an attenuation band.
The attenuating range will persist until we come to some factor, $a_r$ or
$b_r$, which is not paired with a factor of the other type and consequently
represents a cut-off. Below this point, any coincidences which occur
must evidently be those given by the formulae $a_1 = b_1$, $a_2 = b_2$, etc., or
in general, by $a_j = b_j$. They are, therefore, coincidences belonging to
the second of the two types described above. Beyond the cut-off,
coincidences of critical frequencies can again occur, of course, but since
either the chain of $a$’s or the chain of $b$’s must have lost one step in the
alternation of zeros and poles because of the interjection of the cut-off
factor, the coincidences will now be represented by the formula
$a_j = b_{j\pm1}$ and will be of the first type. Eventually, we may encounter a
second cut-off factor, $a_s$ or $b_s$, and again pass into a region of
attenuation. At the second cut-off a second step in the alternation of zeros
and poles is lost by either the $a$ chain or the $b$ chain and further
coincidences will, therefore, be represented by one or the other of the
formulae $a_j = b_j$, or $a_j = b_{j\pm2}$, and will again belong to the second
type. In general, therefore, we can say that coincidences of the first type
will be found only in transmission bands and coincidences of the second type
only in attenuation bands.

** That is, when some of the A’s in Whittaker’s analysis (“Analytical
Dynamics,” p. 191) are zero.

There is a more important reason, however, for making the distinction
between the two types of coincidence of critical frequencies. They
affect the transmission characteristics of the filter in very different ways.
This is easily seen by an inspection of the formulae for $\theta$ and $Z_I$ given
by equations (15) and (16). For example, in the expression for $Z_I$ given
by equation (16), factors coincident in the first way simply cancel out.
In the expression for $\theta$ given by equation (15), on the other hand, they
multiply together to form double zeros and poles. The first type of
coincidence thus leads to factors which affect $\theta$ but not $Z_I$. On the
other hand, factors coincident in the second way disappear in equation (15),
but appear as double zeros and poles in equation (16). These factors,
therefore, affect $Z_I$ but not $\theta$. The distinction between the first and
second type of critical frequency coincidence is thus equivalent to a
distinction between what may be called “transfer constant controlling
factors,” and “impedance controlling factors.”
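The cancellation and doubling described above can be followed on the
smallest case. As a purely formal illustration, in which we take $n = m = 3$
in (15) and (16) and do not inquire whether the particular coincidences are
compatible with the ordering of the roots, the general expressions and their
two specializations read:

```latex
% General case, n = m = 3
\tanh^2\theta = \frac{k_1}{k_2}\,\frac{a_1 a_3\, b_2}{a_2\, b_1 b_3},
\qquad
Z_I^2 = \frac{k_1 k_2}{p^2}\,\frac{a_1 a_3\, b_1 b_3}{a_2\, b_2}.

% Coincidence of the first type, a_3 = b_2 (odd a with even b):
% the factor doubles in tanh^2(theta) and cancels in Z_I^2.
\tanh^2\theta = \frac{k_1}{k_2}\,\frac{a_1 a_3^{\,2}}{a_2\, b_1 b_3},
\qquad
Z_I^2 = \frac{k_1 k_2}{p^2}\,\frac{a_1\, b_1 b_3}{a_2}.

% Coincidence of the second type, a_2 = b_2 (even a with even b):
% the factor cancels in tanh^2(theta) and doubles in Z_I^2.
\tanh^2\theta = \frac{k_1}{k_2}\,\frac{a_1 a_3}{b_1 b_3},
\qquad
Z_I^2 = \frac{k_1 k_2}{p^2}\,\frac{a_1 a_3\, b_1 b_3}{a_2^{\,2}}.
```

In the first specialization $a_3$ is a transfer constant controlling factor;
in the second $a_2$ is an impedance controlling factor.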

The fact that the two groups of coincident factors affect the image
impedance and the transfer constant separately, naturally makes them
valuable design parameters. The paper on “Ideal Filters” referred to
previously furnishes a number of examples of their use in this way.
The cut-off factors, of course, appear in both the image impedance and
the transfer constant expressions and to this extent, therefore, the two
are not independent. It should be observed however, that a given cut-off
will appear as a pole in both expressions or as a zero in both expressions
only if it is one of the group of $a$ factors. By assigning it to the
group of $b$ factors, we can, if we like, make it a pole in one expression
and a zero in the other. Although the cut-off frequencies may be
prescribed, therefore, there is still some possibility of choice in the way
in which they enter the expressions for $Z_I^2$ and $\tanh^2\theta$.

Summary  of  Properties  of  the  Image  Parameters 

Since we shall deal with expressions for $Z_I$ and $\theta$ extensively in the
future, it is desirable to state their properties as definitely as possible.
The conclusions resulting from the preceding discussion may be
summarized as follows:

1. $\tanh^2\theta$ is a rational function of $\lambda$ containing only double zeros
and poles except for two zeros or poles which are simple. $Z_I^2$ is a similar
rational function multiplied by $1/p^2$.

2. The simple zeros or poles occur at negative real values of $\lambda$,
represent cut-offs, and are the same for the two expressions except for the
possibility that either or both may be a zero in one expression and a pole
in the other.

3. In each function the double zeros and poles occur only on those
parts of the negative real $\lambda$-axis at which that function is negative. In
$\tanh^2\theta$ this range is the transmission band of the filter. In $Z_I^2$ it is the
attenuation band.

4. It follows from the alternation of zeros and poles in the original
expressions for $\Delta/\Delta_{11}$ and $\Delta_{22}/\Delta_{1122}$ that all of the zeros and poles in
$\tanh^2\theta$, whether double or simple, alternate with each other along the
negative real $\lambda$-axis. The zeros and poles of $Z_I^2$ also alternate, with the
possible exception of the step between the two cut-offs, which may both be
poles or both zeros. In $Z_I^2$ the alternation includes the pole represented by
the factor $1/p^2$.

Illustrative Expressions for $Z_I$ and Tanh $\theta$

The various coincidence conditions can be studied conveniently with
the aid of a “frequency pattern” diagram** such as that shown by
Fig. 19. In preparing the diagram, the convention has been adopted that
zeros will be represented by circles and poles by crosses. The ratios
$\Delta/\Delta_{11}$ and $\Delta_{22}/\Delta_{1122}$ corresponding to the pattern of Fig. 19 are
therefore

$$\frac{\Delta}{\Delta_{11}} = \frac{a_1 a_3 a_5 a_7}{a_2 a_4 a_6} \qquad (17)$$

and

$$\frac{\Delta_{22}}{\Delta_{1122}} = \frac{a_1 a_4 a_7}{a_3 a_6} \qquad (18)$$

Since $Z_{sc}$ and $Z_{oc}$ differ from $\Delta/\Delta_{11}$ and $\Delta_{22}/\Delta_{1122}$ only by the
multiplier $1/p$, we can also, if we like, consider the diagram as a
representation of these impedances. In this case, of course, the circles and
crosses must be identified with resonances and anti-resonances. The diagram
will frequently be used with this significance in the future.

** Somewhat similar diagrams are given by Guillemin in his discussion of the
lattice in this journal for June 1932.

Fig. 19. Typical frequency pattern for a band-pass network

Upon inspecting either the diagram or the equations, it will be
observed that $a_2$ and $a_5$ are critical frequencies which appear in one
impedance and not in the other. They therefore represent cut-offs, which
are symbolized in Fig. 19 by inclosing the corresponding cross or circle
with a square. Also, $a_3$ and $a_4$ are natural frequencies coincident in the
first manner while $a_1$, $a_6$ and $a_7$ are natural frequencies coincident in the
second manner. The structure is a band-pass filter transmitting from
$a_2$ to $a_5$. The frequencies $a_3$ and $a_4$ fall within the transmitted band
while $a_1$, $a_6$ and $a_7$ fall outside of it. The image impedance and transfer
constant are given by


(19) 

(20) 


from which it appears that the critical factors $a_3$ and $a_4$ affect only the
transfer constant and $a_1$, $a_6$ and $a_7$ only the image impedance. The
cut-off factors $\sqrt{a_2}$ and $\sqrt{a_5}$ are found, of course, in both expressions.
It is evident that by varying the number of transfer constant and
impedance controlling frequencies and by changing the arrangement of
the cut-offs a considerable diversity of expressions can be secured.
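
The bookkeeping behind such an example can be sketched in a few lines of
Python. The sketch is an illustration only. It assumes, as the preceding
discussion suggests, that $Z_I^2$ is proportional to the product of the two
impedance ratios and $\tanh^2\theta$ to their quotient. The factor labels and the
zero or pole assignments below are hypothetical stand-ins patterned on a
band-pass diagram of this kind, not values read from the figure, and the
function names are our own.

```python
from collections import Counter

def combine(first, second, sign):
    """Add (sign=+1) or subtract (sign=-1) the exponents of two factored ratios."""
    total = Counter()
    for name, power in first.items():
        total[name] += power
    for name, power in second.items():
        total[name] += sign * power
    # Factors whose exponents cancel drop out of the expression.
    return {name: power for name, power in total.items() if power != 0}

def classify(z_sq, tanh_sq):
    """Label each factor by the expression(s) in which it survives."""
    labels = {}
    for name in sorted(set(z_sq) | set(tanh_sq)):
        in_z, in_t = name in z_sq, name in tanh_sq
        if in_z and in_t:
            labels[name] = "cut-off"
        elif in_z:
            labels[name] = "impedance controlling"
        else:
            labels[name] = "transfer constant controlling"
    return labels

# Exponent +1 marks a zero, -1 a pole. All labels are hypothetical.
ratio_a = {"a1": 1, "a2": -1, "a3": 1, "a4": -1, "a5": 1, "a6": -1, "a7": 1}
ratio_b = {"a1": 1, "a3": -1, "a4": 1, "a6": -1, "a7": 1}

z_squared = combine(ratio_a, ratio_b, +1)     # product of the two ratios
tanh_squared = combine(ratio_a, ratio_b, -1)  # quotient of the two ratios

for name, kind in classify(z_squared, tanh_squared).items():
    print(name, kind)
```

Factors that coincide with opposite exponents cancel in the product and
double in the quotient, while factors that coincide with equal exponents do
the reverse; the unpaired factors survive as simple factors in both and are
the cut-offs.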


In agreement with the assumption made in an earlier section, all of
this analysis has considered explicitly only the formation of band-pass
filters. It is evident, however, that the extension of the discussion to
low-pass, high-pass and all-pass filters presents no difficulty. We have
merely to consider the band-pass filter in extreme cases when the lower
and upper cut-offs go to the limiting values zero and infinity
respectively. Since, as we have already pointed out in connection with the
discussion of equation (10), the extreme critical frequencies of the
network may assume these values, there is no physical objection to this
process.

An example of a low-pass filter is furnished by the critical frequency
diagram of Fig. 20. If we replace the critical frequency factor $a_1$, which
occurs at zero, by its value $\lambda$, the corresponding expressions for
the image impedance and $\tanh\theta$ are

(21)

and

(22)

Fig. 20. Typical frequency pattern for a low-pass network

Fig. 21. Typical frequency pattern for an all-pass network



The critical frequency pattern for an all-pass structure is shown by
Fig. 21. The corresponding expressions for $Z_I$ and $\tanh\theta$ are

(23)

and

(24)

It  will  be  observed  that  in  the  all-pass  case  the  open-  and  short-circuit 
impedance  networks  are  inverse. 

These  examples  are,  of  course,  hypothetical.  In  order  to  illustrate 
critical  frequency  coincidences  by  a  definite  structure,  we  may  refer  to 
the  classic  loaded  string  problem.  The  usual  discussion  of  this  problem 
assumes  that  both  ends  of  the  string  are  fixed.  In  order  to  consider  the 
arrangement  as  a  filter,  however,  it  is  necessary  to  suppose  that  the 
ends  of  the  string  are  originally  free  to  vibrate  and  that  at  each  end  there 
is  a  bead  whose  mass  is  half  that  of  each  of  the  center  beads.  The 
various  constraints  which  are  imposed,  then,  are  equivalent  to  fixing  one 
or  the  other,  or  both,  of  the  beads  at  the  ends  of  the  string.  When 
both  are  fixed  the  arrangement  reduces  to  the  usual  case. 

It is easy to show that the determinant of the system when both ends
of the string are free to vibrate is

$$\begin{vmatrix}
C/2 & -1 & 0 & 0 & \cdots & 0 & 0 & 0 \\
-1 & C & -1 & 0 & \cdots & 0 & 0 & 0 \\
0 & -1 & C & -1 & \cdots & 0 & 0 & 0 \\
\vdots & & & & & & & \vdots \\
0 & 0 & 0 & 0 & \cdots & -1 & C & -1 \\
0 & 0 & 0 & 0 & \cdots & 0 & -1 & C/2
\end{vmatrix}$$

where $C = 2 - ma\nu^2/\sigma$ and $m$, $a$, $\sigma$ and $\nu$ are respectively the masses of the
center beads, their distance apart, the tension of the string, and the
angular velocity of the vibration. The determinant will have $k$ rows if
there are $k$ beads all told. Since the first and $k$th coordinates (instead
of the first and second, as in our previous discussion) are the ones to
which the constraints are applied, the determinants whose roots we wish
to find are $\Delta$, $\Delta_{11}$, $\Delta_{kk}$ and $\Delta_{11kk}$.

In the fashion made familiar by Rayleigh** and others, these
determinants can be evaluated by the substitution $C = 2\cos\varphi$. The results
are

$$\Delta = -\sin\bigl((k-1)\varphi\bigr)\,\sin\varphi;$$

$$\Delta_{11} = \Delta_{kk} = \cos\bigl((k-1)\varphi\bigr);$$

$$\Delta_{11kk} = \frac{\sin\bigl((k-1)\varphi\bigr)}{\sin\varphi}.$$

The critical frequencies are, of course, the values of $\lambda$ corresponding to
the $\varphi$'s for which these various quantities vanish. $\Delta$ vanishes for
$\varphi = n\pi/(k-1)$, where $n = 0, 1, 2, \ldots, k-1$; $\Delta_{11kk}$ vanishes for the
same values of $\varphi$ with the exception of 0 and $\pi$; and the remaining
quantities, $\Delta_{11}$ and $\Delta_{kk}$, vanish when $\varphi = (2n-1)\pi/(2k-2)$, where
$n = 1, 2, \ldots, k-1$.

** "Theory of Sound" §120.


It is evident by inspection that the condition of coincidence of critical
frequencies is satisfied. The roots of $\Delta_{11}$ are identical with the roots of
$\Delta_{kk}$ and the roots of $\Delta_{11kk}$ are all equal to one or another of the roots
of $\Delta$. The only natural frequency, aside from zero, which does not satisfy
the coincidence condition is the largest root of $\Delta$. This is the cut-off.
The other normal coordinates fall within the transmitting band and
represent transfer constant controlling frequencies. There are no
impedance controlling frequencies. The frequency pattern for $k = 5$
is shown by Fig. 22.
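
The closed forms for the loaded string determinants can be checked
numerically. The following Python sketch is an added illustration, not part
of the original analysis. It builds the tridiagonal array with $C$ on the
diagonal, $C/2$ in the two corners and $-1$ beside the diagonal, sets
$C = 2\cos\varphi$, and compares the determinant and its principal minors with
$-\sin((k-1)\varphi)\sin\varphi$, $\cos((k-1)\varphi)$ and $\sin((k-1)\varphi)/\sin\varphi$. The function
names are hypothetical.

```python
import math

def string_matrix(k, c):
    """k-by-k array for the free-ended string with half-mass end beads."""
    m = [[0.0] * k for _ in range(k)]
    for i in range(k):
        m[i][i] = c / 2 if i in (0, k - 1) else c
        if i + 1 < k:
            m[i][i + 1] = m[i + 1][i] = -1.0
    return m

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n, d = len(a), 1.0
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[piv][col] == 0.0:
            return 0.0
        if piv != col:
            a[col], a[piv] = a[piv], a[col]
            d = -d
        d *= a[col][col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for j in range(col, n):
                a[r][j] -= f * a[col][j]
    return d

def principal_minor(m, drop):
    """Delete the rows and columns whose indices are in `drop`."""
    keep = [i for i in range(len(m)) if i not in drop]
    return [[m[i][j] for j in keep] for i in keep]

def compare(k, phi):
    """Return (numerical, closed-form) values of the four determinants."""
    m = string_matrix(k, 2 * math.cos(phi))
    s = math.sin((k - 1) * phi)
    numerical = (det(m),
                 det(principal_minor(m, {0})),
                 det(principal_minor(m, {k - 1})),
                 det(principal_minor(m, {0, k - 1})))
    closed = (-s * math.sin(phi),
              math.cos((k - 1) * phi),
              math.cos((k - 1) * phi),
              s / math.sin(phi))
    return numerical, closed

numerical, closed = compare(5, 0.7)
print(max(abs(x - y) for x, y in zip(numerical, closed)))
```

For $k = 5$ the same routine can be evaluated at $\varphi = (2n-1)\pi/8$ to confirm
that the minors with one end bead fixed vanish there, which is the root
spacing used in the coincidence argument.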


PART  IV.  COMPOSITE  FILTERS 

The analysis of the preceding portion of the paper has gone a
considerable distance in the direction of pointing out what kinds of filter
characteristics are physically obtainable but it leaves two major
problems unsolved. The more obvious one is concerned with the image
impedance at the output terminals of the structure. As yet, of course,
we have considered only the transfer constant and the image impedance
at the input terminals. We have still to determine in what ways our
choice of these two quantities may restrict the second image impedance.

The other problem has to do with the possible relations which may
exist between the quantities $\Delta/\Delta_{11}$ and $\Delta_{22}/\Delta_{1122}$ because of the fact
that they are derived from the same four-terminal network. It is clear enough
from Foster’s work on two-terminal reactive structures that a network
can be built in which either one of these two functions can be assigned
any value consistent with the general requirement that poles and zeros
must separate one another. We have still to discover, however, what
limits may exist in our choice of the second of these two functions when
the first has been fixed. In particular, we are concerned with finding
out whether the zeros and poles of the second function can always be so
chosen that the coincidence condition is satisfied when the zeros and
poles of the first function have been arbitrarily arranged. Our
discussion of the ways in which critical frequency coincidences can originate
physically suggests that the coincidence conditions are not inherently
unnatural or improbable ones, but it can scarcely be regarded as a
conclusive answer to the question. The existence of some limitation upon
at least the relative numerical values which may be assigned to the
critical frequencies is still a possibility and remains to be investigated.

The answers to these questions must be found in the last two items in
our previous list of the properties of the determinants associated with
physical networks. We shall apply these last conditions somewhat
indirectly, using as intermediaries the set of auxiliary parameters
described below. The parameters are introduced partly because they
considerably facilitate the statement of the consequences to which these
last conditions lead. What is still more important, however, is the fact
that they bridge the gap between the general analysis we have followed
and the usual section-by-section analysis of conventional ladder type
filters. The relation thus established will eventually indicate the
process by means of which any physically possible filter can be built.

The System of Roots of Tanh $\theta$ = 1

The auxiliary parameters we shall employ are defined as the system of
roots of the equation $\tanh\theta = 1$. They will be symbolized by $r_1 \ldots r_n$.
Since they are, of course, also roots of the equation $\tanh^2\theta = 1$, which
is a rational expression, they can be obtained by solving for the roots of
a certain polynomial. For the purposes of formal analysis we shall
continue to assume that the variable in terms of which $\tanh^2\theta$ is
expressed is $\lambda$. In examples and practical applications, however, it will
often be more convenient to consider that the variable is frequency.
Since $\lambda$ is proportional to $-f^2$, there will be a pair of positive and
negative roots in terms of $f$ for each root in terms of $\lambda$. Physically, of
course, the roots represent frequencies at which the attenuation of the
network is infinite.

Since the roots are closely connected with the section-by-section
analysis of filters, it will be helpful to illustrate them by reference to
known structures. If we consider the half-section of m-derived type
low-pass filter shown by Fig. 23, for example, we find easily that

(25)

The variation of this quantity beyond the cut-off, $f_c$, is shown
graphically by Fig. 24. The root, of course, occurs at the frequency $f_\infty$, and
is of the first order. The transfer constant of the full-section shown
by Fig. 25 can also be calculated directly. Since it must be just twice
that of the half-section, however, it is simpler to make use of the formula

$$\tanh\theta = \frac{2\tanh\frac{\theta}{2}}{1 + \tanh^2\frac{\theta}{2}}. \qquad (26)$$

This expression is evidently zero when $\tanh\theta/2$ is zero or infinite and
attains its maximum value, unity, when $\tanh\theta/2 = 1$. The new curve
is therefore that shown by Fig. 26. Since it now touches, instead of
crossing, the line unity, the root is a double one. This correspondence of
simple roots with half-sections and double roots with full-sections is
characteristic of all familiar filter sections and will be found helpful in
interpreting the results of our later discussion. Roots of higher
multiplicity than two may also occur, but since they can be regarded as a
superposition of simple and double roots they need not be given
particular consideration.

Fig. 23
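
The difference between a curve that crosses unity and one that merely
touches it can be made concrete with a short Python sketch. This is an
added illustration with hypothetical function names. Writing $t$ for
$\tanh\theta/2$, the half-section misses unity by $1 - t$, which changes sign at
$t = 1$, while equation (26) gives a full-section deficit of
$1 - 2t/(1+t^2) = (1-t)^2/(1+t^2)$, which cannot change sign.

```python
def full_section_tanh(t_half):
    """tanh(theta) from t_half = tanh(theta/2), as in equation (26)."""
    return 2.0 * t_half / (1.0 + t_half ** 2)

def half_section_deficit(t_half):
    """Amount by which the half-section falls short of unity."""
    return 1.0 - t_half

def full_section_deficit(t_half):
    """Amount by which the full-section falls short of unity."""
    return 1.0 - full_section_tanh(t_half)

for t in (0.8, 0.9, 1.0, 1.1, 1.2):
    print(t, half_section_deficit(t), full_section_deficit(t))
```

The first deficit passes through zero with a change of sign, the mark of a
simple root; the second reaches zero without changing sign, the mark of a
double root.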

This example is simple enough but it avoids a difficulty which must
be settled before we can make satisfactory progress. The quantities
$r_1 \ldots r_n$ have been described as roots of the equation $\tanh\theta = 1$.
Actually, however, they were obtained as roots of $\tanh^2\theta = 1$. In
a sense, of course, it does not matter which equation is used. Since
$\tanh\theta$ is a double valued quantity, however, it is not clear that one of
its branches, instead of being equal to $+1$ at all of the points $r_1 \ldots r_n$,
may not be equal to $+1$ at some of these points and $-1$ at others.
The difficulty was not obvious in the particular example we chose
because there was only one root and that one occurred on the real
frequency axis, where it is clear for physical reasons that we must mean
by $\tanh\theta$ the branch of the function which gives positive rather than
negative attenuation. In the general case, where we may have a
number of roots occurring at both complex and real frequencies,
however, it is not obvious that the branch which makes $\tanh\theta$ positive at
real frequencies is necessarily the one which makes $\tanh\theta = +1$ at all
of the points $r_1 \ldots r_n$.

Fig. 25

The difficulty can be avoided most conveniently by representing $\tanh\theta$
on a properly chosen Riemann surface. We shall choose the surface
whose winding points are the cut-offs and whose branch cut is the
portion of the negative real $\lambda$-axis connecting the cut-offs. This evidently
is the interval corresponding to the transmitting band of the filter. The
reason for this choice will be understood if we recall the facts we have
previously established about the behavior of the function $\tanh^2\theta$ in the
transmission band. It will be remembered that in this interval $\tanh^2\theta$
must be a negative real quantity and that the zeros and poles of $\tanh^2\theta$
must alternate and must be double except at the end points of the
interval, where they are simple. There are, furthermore, no zeros or poles
of $\tanh^2\theta$ outside the transmitting band. It follows that as we pass
from any zero to a succeeding pole in the interval, $\tanh^2\theta$ will pass
through all negative real values from zero to $-\infty$. From this pole to
the next zero it will go back from $-\infty$ to zero, and so on. All told, the
function passes through any given negative value at least as many times
as it has zeros or poles, when the zeros and poles are counted with the
proper multiplicity. Since a rational function can assume any
prescribed value only as many times as the number of its zeros or poles,
however, we can conclude that $\tanh^2\theta$ is never a negative real quantity
outside of the transmitting band.

Since $\tanh^2\theta$ is a negative real quantity only along the branch cuts,
the real component of $\tanh\theta$ can be zero only at these frequencies.
Each branch of $\tanh\theta$ is, however, a continuous function on its sheet
of the Riemann surface, and it therefore follows that the sign of the real
component of each branch will remain unchanged on its sheet.
Consequently, if one of the branches of $\tanh\theta$ is equal to $+1$ at any of the
points $r_1 \ldots r_n$ it must be equal to $+1$ and not $-1$ at each of the others.
Calling this branch $\tanh\theta$, then, we are able to justify our assumption
that the frequencies at which the equation $\tanh^2\theta = 1$ is satisfied are
also frequencies satisfying the equation $\tanh\theta = 1$. This branch is, of
course, the one which gives positive attenuation at real frequencies in
attenuating bands, and therefore corresponds to the usual convention
of signs in these regions.


Properties  of  the  Roots 


With the help of this exact definition we can proceed to establish
the properties of the roots and their application to filter problems. It is
evident from equation (7) that the roots can be defined analytically by
means of the relation

$$\frac{\Delta\,\Delta_{1122}}{\Delta_{11}\Delta_{22}} = 1. \qquad (27)$$

With the help of equation (2) this expression can be rewritten as

$$\frac{\Delta_{12}^2}{\Delta_{11}\Delta_{22}} = 0. \qquad (28)$$

The roots therefore are the values of $\lambda$ at which $\Delta_{12} = 0$ with the
exception possibly of values at which $\Delta_{12}$, $\Delta_{11}$ and $\Delta_{22}$ have common roots.
We find by inspection that the roots must have the following properties:

1. Since the coefficients in (28) are real, possible roots are either real
or exist as conjugate complex pairs.

2. As we have already mentioned, $\Delta_{11}$ and $\Delta_{22}$ vanish only on the
negative real $\lambda$-axis. The roots of $\Delta_{12}$, however, need not be confined to
this axis. Any complex or positive real member of the set $r_1 \ldots r_n$ must
therefore be a root of $\Delta_{12}$ but not of $\Delta_{11}$ or $\Delta_{22}$. Since $\Delta_{12}$ appears as a
square in equation (28), all complex or positive real roots must be of
even multiplicity.

3. The number of roots, counting both positive and negative, is equal
to the degree of the rational function $\Delta\Delta_{1122}/\Delta_{11}\Delta_{22}$. Since $\Delta/\Delta_{11}$ and
$\Delta_{22}/\Delta_{1122}$ may contain coincident zeros or poles, representing impedance
controlling factors, this bears no necessary relation to the number of meshes
in the circuit. It is easily seen, however, that the degree of the expression
will be equal to the number of pole-zero intervals of $\tanh^2\theta$ in the
transmitting band, so that the number of roots can readily be determined by
this means. Since the phase shift sweeps through $\pi/2$ radians in each
pole-zero interval, the number of roots can also be obtained by dividing the
total phase shift in radians by $\pi/2$.

4. If we let $\theta_1$ and $\theta_2$ be the transfer constants of two structures
connected in tandem with matched image impedances, the transfer
constant $\theta$ of the composite network will be given by

$$\tanh\theta = \tanh(\theta_1 + \theta_2) = \frac{\tanh\theta_1 + \tanh\theta_2}{1 + \tanh\theta_1\tanh\theta_2}. \qquad (29)$$

It is clear from a study of this expression that $\tanh\theta$ will be unity
whenever either $\tanh\theta_1$ or $\tanh\theta_2$ is unity. We can therefore conclude
that the roots of a composite structure, such as an ordinary composite
ladder type filter, are the sum of the roots of all of its constituents.
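
The addition property can be verified with exact arithmetic. The Python
sketch below is an added illustration with hypothetical names. Writing
$t_1$ and $t_2$ for the two hyperbolic tangents, it rests on the identity
$1 - (t_1 + t_2)/(1 + t_1 t_2) = (1 - t_1)(1 - t_2)/(1 + t_1 t_2)$, which shows that
the deficit of the composite from unity is the product of the deficits of
the constituents, so that their roots and multiplicities simply accumulate.

```python
from fractions import Fraction

def tandem(t1, t2):
    """tanh(theta1 + theta2) from t1 = tanh(theta1) and t2 = tanh(theta2)."""
    return (t1 + t2) / (1 + t1 * t2)

def deficit_product(t1, t2):
    """Right-hand side of the identity for 1 - tanh(theta1 + theta2)."""
    return (1 - t1) * (1 - t2) / (1 + t1 * t2)

# A constituent sitting exactly on a root (t1 = 1) forces the composite to unity.
print(tandem(Fraction(1), Fraction(3, 7)))

# The identity holds for arbitrary rational values of the two tangents.
t1, t2 = Fraction(2, 5), Fraction(9, 4)
print(1 - tandem(t1, t2) == deficit_product(t1, t2))
```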

Of these properties the first and third contribute nothing essentially
new to the solution of the problems enumerated at the beginning of this
portion of the paper. The second, however, amounts to a restriction
on the choice of the a’s and b’s which can be used in the general
expressions (13) and (14) if the coincidence condition is to be satisfied and
could not have been obtained from our earlier analysis. One other restriction,
involving the choice between a given rational function representation
of $\tanh^2\theta$ and its reciprocal, will appear later in the discussion of the
determination of the transfer constant from the roots. The final
property of the list, of course, indicates the connection between the
general analysis we have been following and the section-by-section
analysis customary for conventional structures. We shall use the roots
in the proofs of two theorems which will serve as guides in all of our later
discussion.


Relation Between Image Impedances and Transfer Constant

The first theorem states that, except for an arbitrary constant
multiplier, the second image impedance of any filter** is uniquely determined
at all frequencies by the transfer constant and the first image impedance.
The arbitrary constant multiplier represents a possible ideal transformer
in the circuit. Except for such transformers, then, filters can be
analyzed in terms of only two parameters instead of the three which
are usually considered necessary.

The theorem is most easily understood from a study of the expression
$(Z_{I_2}/Z_{I_1})(1 - \tanh^2\theta)$. Using equations (2), (5) and (7), we easily find
that

$$\frac{Z_{I_2}}{Z_{I_1}}\,(1 - \tanh^2\theta) = \frac{\Delta_{11}}{\Delta_{22}} \left( \frac{\Delta_{11}\Delta_{22} - \Delta\,\Delta_{1122}}{\Delta_{11}\Delta_{22}} \right) = \frac{\Delta_{12}^2}{\Delta_{22}^2}. \qquad (30)$$

Since $\Delta_{12}/\Delta_{22}$ is a rational function of $\lambda$ with real coefficients, it is
always a real quantity at real values of $\lambda$ and its square must consequently
be positive. One of the two quantities $Z_{I_2}/Z_{I_1}$ and $1 - \tanh^2\theta$ can
therefore change sign along the real axis only if the other does also. This
fact is sufficient, together with what we already know about the general
analytic form which the image impedances and transfer constant must
take as functions of $\lambda$, to specify one of the image impedances when the
transfer constant and the other image impedance have been determined.

** The statement of the theorem has been restricted to filters as a matter of
simplicity. It is easily seen, however, that both this theorem and the one which
follows hold for general reactive networks.

By inspection of our previous results, we easily find that the
possibilities of securing changes of sign in $Z_{I_2}/Z_{I_1}$ and $1 - \tanh^2\theta$ may be
described as follows:

1. Within any transmitting band $Z_{I_2}/Z_{I_1}$ must be positive. Since
$\tanh\theta$ is imaginary in transmitting bands, $1 - \tanh^2\theta$ will also be
positive.

2. As we pass across a cut-off $\tanh\theta$ becomes real instead of
imaginary. There are then two cases to consider. If the irrational factor
representing the cut-off appears in the numerator of $\tanh\theta$,
which means that the phase shift at the cut-off is an even multiple of $\pi/2$,
the expression will be very small at frequencies just beyond the cut-off
and $1 - \tanh^2\theta$ will, therefore, still be positive in this range. On the
other hand, if the cut-off factor appears in the denominator of $\tanh\theta$,
which means that the phase shift is an odd multiple of $\pi/2$ at the cut-off,
$1 - \tanh^2\theta$ will change sign as we pass across this point.

Both image impedances will, of course, also contain the same
irrational cut-off factor. If the cut-off factors in the two impedances are
both in the numerator or both in the denominator they will cancel out
when the ratio $Z_{I_2}/Z_{I_1}$ is taken. Under these circumstances, therefore,
the ratio of the two image impedances will not change sign as we pass
the cut-off. On the other hand, if one cut-off factor is in the numerator
and the other in the denominator, they will multiply together instead of
cancelling out. When this condition occurs, therefore, the ratio $Z_{I_2}/Z_{I_1}$
will change sign at the cut-off. We can consequently conclude that if
the cut-off factor is in the numerator of $\tanh\theta$, it must be either in the
numerator of both image impedance expressions or in the denominator
of both expressions, and vice versa.




3. At frequencies further out in the attenuating range $1 - \tanh^2\theta$ will
change sign at roots of odd multiplicity of the equation $\tanh\theta = 1$.
It will not change sign, however, at roots of even multiplicity. The
ratio $Z_{I_2}/Z_{I_1}$ will change sign in this range whenever one image impedance
has a zero or pole which does not correspond to either a zero or pole of
the other image impedance. It will not change sign, however, at critical
frequencies which are found in both image impedance expressions. We
can therefore conclude that simple roots of $\tanh\theta = 1$ represent
frequencies at which one but not both image impedances has a zero or pole.
Roots of even multiplicity of $\tanh\theta = 1$, on the other hand, can be
introduced at pleasure without limiting the image impedances and
coincident natural frequencies can be introduced into the image
impedance expressions without limiting the transfer constant.
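
The three rules can be condensed into a small routine. The Python sketch
below is an added illustration under stated assumptions, not a complete
design procedure. It records only which critical frequencies occur in each
image impedance and where the cut-off factors stand; all labels and the
function name are hypothetical. Rule 3 says that the simple roots are
exactly the frequencies found in one impedance but not the other, so the
factors of the second impedance are the symmetric difference of the factors
of the first and the set of simple roots. Rule 2 fixes the cut-off factors.

```python
def second_impedance(z1_factors, z1_cutoffs, tanh_cutoffs, simple_roots):
    """Predict which factors occur in the second image impedance.

    z1_factors   : labels of critical frequencies of Z_I1 in attenuating ranges
    z1_cutoffs   : cut-off label -> "num" or "den" (position in Z_I1)
    tanh_cutoffs : cut-off label -> "num" or "den" (position in tanh theta)
    simple_roots : labels of odd-multiplicity roots of tanh theta = 1
    """
    flip = {"num": "den", "den": "num"}
    z2_cutoffs = {}
    for label, position in z1_cutoffs.items():
        if tanh_cutoffs[label] == "num":
            z2_cutoffs[label] = position        # same place in both impedances
        else:
            z2_cutoffs[label] = flip[position]  # opposite places
    # A simple root marks a frequency present in exactly one impedance.
    z2_factors = set(z1_factors) ^ set(simple_roots)
    return z2_factors, z2_cutoffs

# Hypothetical pattern: labels are illustrative, not read from any figure.
factors, cutoffs = second_impedance(
    z1_factors={"f1", "f3", "f10", "f11"},
    z1_cutoffs={"lower": "den", "upper": "num"},
    tanh_cutoffs={"lower": "den", "upper": "num"},
    simple_roots={"f2", "f10", "f12"},
)
print(sorted(factors), cutoffs)
```

The routine says nothing about whether an ordinary factor enters the second
impedance as a zero or as a pole; that placement must still be settled from
the alternation of zeros and poles. Roots of even multiplicity are simply
omitted from `simple_roots`, since they place no restriction on the
impedances.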

It will be recalled that impedance controlling frequencies can originate
physically either because of the presence of multiple roots in the
network determinants or because in extreme cases a normal coordinate of a
constrained system may coincide with a normal coordinate of the system
before the constraint is applied. As we mentioned in the course of the
discussion of these possibilities, the distinction between these two
methods of obtaining impedance controlling frequencies is related to the
distinction we have just drawn between impedance controlling
frequencies which appear at both ends of the network and impedance
controlling frequencies appearing at one end only. The relation can be
seen most easily if we observe that the product and ratio of $Z_{I_1}$ and $Z_{I_2}$
are respectively $\Delta/\Delta_{1122}$ and $\Delta_{22}/\Delta_{11}$. If the same critical frequency factor
appears in both $Z_{I_1}$ and $Z_{I_2}$, it must evidently appear as a double pole
or zero in either their product or their ratio. Both the product and the
ratio, however, are of the first degree in the determinants. It
consequently follows that when the same impedance controlling factor appears
at both ends of the network, one at least of the corresponding network
determinants must have a double root. Conversely, it is easy to
establish by inspection of such equations as (13) and (14) that when an
impedance controlling factor is formed by the coincidence of multiple
roots, it must appear at both ends of the structure. There is, therefore,
an exact correspondence between the two phenomena. By the same
logic, of course, we can prove that the occurrence of an impedance
controlling factor at only one end of the network is correlated with the
second method of producing such factors. As we might expect from
this analysis, when impedance controlling factors appear at both ends
of the network, the structure will usually require an appreciably larger
number of elements than would a comparable network having the same
transfer constant but simpler impedance characteristics.

Application of the Image Impedance Theorem

The  relation  established  in  the  preceding  section  together  with 
the  properties  we  have  previously  discovered  for  image  impedances 
make  it  a  simple  matter  to  predict  the  second  image  impedance  (except 
for  the  arbitrary  constant  multiplier  corresponding  to  an  ideal  trans¬ 
former)  as  soon  as  the  first  image  impedance  and  the  roots  ri . . .  r. 
are  known.  An  example  is  furnished  by  Fig.  27,  which  is  supposed 
to  represent  the  frequency  pattern  at  one  end  of  a  band-pass  filter. 
The  corresponding  image  impedance  and  transfer  constant  expressions 

are  Z;,  «  y/kikt  V^Ou  ^  ^  a/^ 

assume,  for  purposes  of  illustration,  that  there  are  three  simple  roots  of 
tanh  9  IB  1.  Two  of  them  are  not  supposed  to  coincide  with  any  of  the 


Fig.  27.  Frequency  pattern  at  input  terminals  of  a  band-pass  filter 

natural  frequencies  represented  by  Fig.  27,  but  they  are  indicated  in 
that  figure,  as  a  matter  of  convenience,  by  the  numbers  2  and  12.  The 
third  simple  root  will  be  assumed  coincident  with  the  anti-resonant 
frequency 10 of Fig. 27. Since there are five "spaces," or intervals
between poles and zeros, in the expression for $\tanh\theta$, there remain two
roots  to  be  accounted  for.  They  will  be  taken  as  forming  together  a 
double  root,  which,  of  course,  will  have  no  influence  upon  the  image 
impedance  relations  of  the  structure. 

The proper expression for $Z_{I_2}$ is easily built up step-by-step. Considering
first the lower cut-off, we observe that the factor y/oi appears
in the denominator of $\tanh\theta$. It must, therefore, be found in the
numerator of one image impedance and in the denominator of the other.
Since it occurs in the denominator of $Z_{I_1}$ it must be in the numerator
of $Z_{I_2}$. Below the band, $Z_{I_1}$ contains the factors at and at, which do
not correspond to any simple roots of $\tanh\theta$. They must, therefore, be
found in $Z_{I_2}$ also. On the other hand, the simple root at 2 must correspond
to a factor which is in one image impedance but not the other.
Since the factor is not found in $Z_{I_1}$, it must occur in $Z_{I_2}$. At the upper
cut-off, the factor y/ at appears in the numerator of $\tanh\theta$. It must,
therefore, appear in the same place in both image impedance expressions.
Above the cut-off, $a_{11}$ represents a factor of $Z_{I_1}$ which does not correspond
to a simple root and must therefore appear in $Z_{I_2}$ also. On the
other hand, the factor $a_{10}$ of $Z_{I_1}$ does correspond to a simple root and
will, consequently, not be found in $Z_{I_2}$. Finally, the simple root at 12,
which does not correspond to a factor of $Z_{I_1}$, will produce a factor of
$Z_{I_2}$. By putting all of these conclusions together and adding the usual
requirement on the alternation of zeros and poles we find that the second
image impedance can be represented as


z„  -  Vk'X 


OlOlOll 


Fig. 28. Frequency pattern at output terminals corresponding to input pattern
of Fig. 27.


Since $\tanh\theta$ must be the same in either direction through the structure,
the frequency pattern at the second end of the network is easily found
from this expression for $Z_{I_2}$. It is shown by Fig. 28.

Determination of Transfer Constant by the Roots of Tanh $\theta = 1$

Our second theorem states that the transfer constant of any physically
realisable filter is uniquely determined** by the cut-offs and the roots
$r_1 \ldots r_n$. We can take it for granted, of course, that $\tanh\theta$ for the
structure under consideration must be given as the square root of a
certain rational function of $\lambda$. Our theorem is, therefore, equivalent
to the statement that only one such rational function can be chosen when
the cut-offs and roots $r_1 \ldots r_n$ are known. The fact that there always

** Strictly speaking, it is $\tanh\theta$ which is uniquely determined, and since
$\tanh(\theta + i\pi) = \tanh\theta$ there is still an uncertainty of $\pi$ radians phase shift in $\theta$
itself. Since a phase shift of this amount can be provided merely by crossing
either the input or output terminals, however, it will be ignored.



exists  such  a  rational  function,  when  the  necessary  limitations  on  the 
roots  are  observed,  will  be  shown  later. 

Let a rational function satisfying the requirements be denoted by
$A_1$. $A_1$ must of course be of degree $n$ if there are $n$ roots. What we are
to prove is that there cannot be a second rational function meeting the
same conditions. Suppose, on the contrary, that a second rational
function, denoted by $A_2$, exists. If we let $\lambda_1$ and $\lambda_2$ represent the
prescribed cut-offs, it is evident that $\sqrt{A_1}$ and $\sqrt{A_2}$, or in other words,
the two values of $\tanh\theta$ corresponding to the two assumed rational
functions, must be expressible as

$$\sqrt{A_1} = (\lambda - \lambda_1)^{\pm 1/2} (\lambda - \lambda_2)^{\pm 1/2} Q_1$$

and

$$\sqrt{A_2} = (\lambda - \lambda_1)^{\pm 1/2} (\lambda - \lambda_2)^{\pm 1/2} Q_2,$$

where $Q_1$ and $Q_2$ represent rational functions. The cut-off factors
$(\lambda - \lambda_1)^{\pm 1/2}$ and $(\lambda - \lambda_2)^{\pm 1/2}$ must of course be the same in the two
expressions, except for the possibility that either exponent may be $+\tfrac{1}{2}$ in one
expression and $-\tfrac{1}{2}$ in the other. In choosing the signs of the square
roots we shall follow the Riemann surface representation outlined in a
previous paragraph. The surfaces for the two functions are, of course,
the same since they are supposed to give the same transmitting bands.

We now introduce two new functions, $R_1$ and $R_2$, defined by
$R_1 = \sqrt{A_1}\sqrt{A_2}$ and $R_2 = \sqrt{A_1}/\sqrt{A_2}$. $R_1$ and $R_2$ must obviously be
rational functions of frequency since such expressions as $(\lambda - \lambda_1)^{\pm 1/2}$ will
either disappear or become equal to $(\lambda - \lambda_1)^{\pm 1}$ when the product and ratio
of $\sqrt{A_1}$ and $\sqrt{A_2}$ are taken. Moreover, they will be at most of the $n$th
degree before common factors are cancelled, since both $A_1$ and $A_2$ are of
the $n$th degree. Obviously, however, any such factor as $(\lambda - \lambda_1)^{\pm 1/2}$
must disappear in either the product or ratio of $\sqrt{A_1}$ and $\sqrt{A_2}$, and
either $R_1$ or $R_2$ must therefore actually be of less than the $n$th degree.
Both $R_1$ and $R_2$ must nevertheless be equal to unity at each of the $n$
points $r_1 \ldots r_n$, since $\sqrt{A_1}$ and $\sqrt{A_2}$ are individually unity at these
points. It follows that either $R_1$ or $R_2$ must be equal to unity identically,
for a rational function of degree less than $n$ which equals unity at $n$
points must be unity everywhere. Consequently, if two functions, $A_1$ and
$A_2$, exist corresponding to the given roots and transmitting bands, they
must either be the same or reciprocals of one another.

Finally, we may show that the possibility that $A_1$ and $A_2$ may be
reciprocals must be abandoned. Consider the functions $1 - A_1$ and
$1 - A_2$, or, in other words, $1 - \tanh^2\theta$. From equation (2), $1 - \tanh^2\theta$
must be equal to $\Delta_{12}^2/\Delta_{11}\Delta_{22}$. We have already shown that $\Delta_{11}$ and $\Delta_{22}$
must be positive real quantities when $\lambda$ is a positive real quantity if the
network is to be physically realisable. Since $\Delta_{12}$ will at least be real
when $\lambda$ is real, its square will also be a positive real quantity. The
complete expression for $1 - \tanh^2\theta$ must consequently be positive along
the positive real $\lambda$-axis. But if $A_1$ and $A_2$ are reciprocals it is obviously
impossible for $1 - A_1$ and $1 - A_2$ to be positive simultaneously. We
conclude, therefore, that there can be only one function corresponding to
any given choice of roots and cut-offs.

As an example of this theorem, we will take a low-pass structure
having the transfer constant

$$\tanh\theta = i\,2.675\,x\,\frac{1 - 1.264\,x^2}{(1 - 3.383\,x^2)\sqrt{1 - x^2}} \tag{31}$$

where $x = f/f_c$.

The roots are found by setting this expression equal to one. Squaring
both sides, then, gives us

$$(1 - 3.383\,x^2)^2 (1 - x^2) = -2.675^2\,x^2 (1 - 1.264\,x^2)^2 \tag{32}$$

which readily reduces to

$$(x^2 - 4.00)^2 (x^2 - 9.00) = 0. \tag{33}$$


The equation therefore has a double root at $x^2 = 4$ and a simple root at
$x^2 = 9$. As we saw previously, however, full-section and half-section
low-pass filters of the conventional m-derived type give respectively
double and simple roots at a frequency which is related to $m$ by
$f^2/f_c^2 = 1/(1 - m^2)$. We also noticed that a combination of two structures in
tandem will have for its roots all of the roots of each of the structures
taken separately. We can, therefore, obtain the prescribed double root
at $x^2 = 4$ and the simple root at $x^2 = 9$ by connecting a full-section
with $m = 0.866$ in tandem with a half-section with $m = 0.943$, as shown
by Fig. 29. It follows from our general theorem that the transfer
constant of this structure will be the same as that given by (31) at all
frequencies. If we build it in this form, however, we must, of course,
accept the relatively simple image impedance characteristic which the
structure gives.
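
The arithmetic of this example can be checked numerically. The following is a minimal sketch, not part of the original paper: it assumes that a full m-derived section contributes $\tanh(\theta/2) = m\,t$ and a half-section contributes $\tanh\theta = m\,t$, with $t = ix/\sqrt{1 - x^2}$, and combines the two by the addition formula for the hyperbolic tangent. The function names are hypothetical.

```python
import math


def m_for_root(x_squared):
    """Return m for a section whose root lies at x^2 = 1 / (1 - m^2)."""
    return math.sqrt(1.0 - 1.0 / x_squared)


def tandem_coefficients(m_full, m_half):
    """Coefficients (a, b, c) in tanh(theta) = t * (a - b*u) / (1 - c*u).

    Here u = x^2 and t = i*x / sqrt(1 - x^2). The section formulas
    tanh(theta_full / 2) = m_full * t and tanh(theta_half) = m_half * t
    are assumptions of this sketch, combined by the tanh addition rule.
    """
    a = 2.0 * m_full + m_half
    b = a + m_full**2 * m_half
    c = 1.0 + m_full**2 + 2.0 * m_full * m_half
    return a, b, c


def root_condition(u, a, b, c):
    """Squared form of tanh(theta) = 1, as in (32), moved to one side."""
    return (1.0 - c * u) ** 2 * (1.0 - u) + u * (a - b * u) ** 2


def slope(u, a, b, c, h=1e-6):
    """Central-difference derivative of root_condition with respect to u."""
    return (root_condition(u + h, a, b, c) - root_condition(u - h, a, b, c)) / (2.0 * h)


m_full = m_for_root(4.0)  # full section, double root at x^2 = 4
m_half = m_for_root(9.0)  # half section, simple root at x^2 = 9
a, b, c = tandem_coefficients(m_full, m_half)

print(f"m_full = {m_full:.3f}, m_half = {m_half:.3f}")
print(f"a = {a:.3f}, b/a = {b / a:.3f}, c = {c:.3f}")
print("residual at x^2 = 4:", root_condition(4.0, a, b, c))
print("residual at x^2 = 9:", root_condition(9.0, a, b, c))
print("slope at x^2 = 4:", slope(4.0, a, b, c))
print("slope at x^2 = 9:", slope(9.0, a, b, c))
```

Under these assumptions the coefficients agree with the constants 2.675, 1.264 and 3.383 of (31) to the three decimals quoted. The residual vanishes at both roots, while the slope vanishes only at $x^2 = 4$, which is what distinguishes the double root from the simple one.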

The  General  Composite  Filter 

The example we have just discussed suggests at once that we may
always be able to identify the various roots of $\tanh\theta = 1$ with particular
elementary structures. If this can be done our whole discussion of
filters  will  be  much  simplified.  Every  filter,  whatever  its  actual  physical 
configuration  may  be,  can  then  be  regarded  as  a  composite  of  certain 
elementary  constituents,  which  can  be  combined  together  like  so  many 
bricks  to  produce  any  particular  final  result.  Since  the  elementary 
constituents  are  fixed  by  the  general  analysis,  the  properties  of  particular 
configurations  will  not  enter  into  the  discussion.  Moreover,  since  we 
need  consider  only  real  roots  and  pairs  of  conjugate  complex  double 
roots  the  elementary  constituents  will  exist  in  only  a  limited  variety 
and  all  possible  filter  characteristics  can  therefore  be  built  up  by  com¬ 
bining  together  elementary  characteristics  of  a  few  simple  forms.  It 
also  follows,  of  course,  that  in  order  to  show  that  any  filter  can  be 


Fig. 29


Fig. 30. The general composite filter


constructed,  we  need  only  show  how  the  elementary  constituents  can  be 
built. 

Naturally,  the  conventional  ladder  sections  employed  in  our  preceding 
example  cannot  be  used  to  represent  the  complex  and  positive  real  roots 
required  in  the  general  analysis.  Moreover,  if  the  complete  filters  are 
to  have  a  wide  variety  of  image  impedances  we  must  be  able  to  assign 
more  general  forms  for  the  impedances  of  the  constituent  sections  than 
are obtainable from usual ladder sections. If we assume that appropriate
configurations for the necessary elementary sections can be
found, however, the complete network can be built up step-by-step.
The process is illustrated by Fig. 30. One image impedance, $Z_{I_1}$, and
the roots $r_1 \ldots r_n$ are supposed to be prescribed. We will also assume,
for simplicity, that all of the roots occur on the negative real $\lambda$-axis.
The first elementary section, then, is constructed to give the root $r_1$
and to have one image impedance equal to the prescribed $Z_{I_1}$. At
the other end of this section we will have some other image impedance,
$Z_{I_2}$, which by our preceding theorem can be determined from $Z_{I_1}$ and
$r_1$. The second section provides $r_2$ and has one image impedance equal
to $Z_{I_2}$. At the other end of this section, we have still a third image
impedance, $Z_{I_3}$. The third section matches $Z_{I_3}$ and gives the third
root, and so on. The final image impedance of the complete
structure will then be whatever the prescribed original impedance $Z_{I_1}$
and the prescribed set of roots $r_1 \ldots r_n$ dictate. Since positive real
roots and complex roots must be double, while complex roots must in
addition occur in conjugate pairs, they cannot be represented individually
by elementary sections as this procedure suggests. If we introduce
them  in  the  combinations  in  which  they  naturally  occur,  however,  the 
process  can  be  followed  for  them  as  well  as  for  negative  real  roots. 

If  this  method  of  constructing  filters  is  a  practicable  one,  we  must 
evidently  be  able  to  provide  elementary  sections  which  will  represent 
any  negative  root,  any  double  positive  root,  or  any  pair  of  conjugate 
complex  double  roots  and  which  will  have  one  image  impedance  in  any 
form consistent with the general limitations imposed by our earlier
analysis. In the succeeding portions of the paper we will show how
elementary  structures  meeting  these  requirements  can  be  constructed 
for  each  type  of  filter.  Our  success,  of  course,  shows  that  the  conditions 
of  physical  realizability  which  we  have  determined  are  sufficient  as 
well  as  necessary. 

The elementary sections have been divided for convenience into symmetrical
and unsymmetrical types. The unsymmetrical sections, of
course, correspond to simple roots, which are found only on the negative
real $\lambda$-axis. They are discussed in Part VI. The symmetrical structures
represent double roots and are described in Part V. They have
been divided into structures representing negative double roots, structures
representing positive double roots, and structures representing
conjugate pairs of double complex roots. The negative double roots
can, of course, also be represented by combinations of two unsymmetrical
structures in tandem, each corresponding to a simple root.
They  are  given  separate  attention,  however,  because  of  their  frequent 
occurrence  in  practice.  The  elementary  sections  which  have  been  used 
are  all  of  the  general  ladder  or  lattice  types.  So  far  as  possible  they  are 
of  familiar  sorts  and  have  been  described  in  familiar  terminology.  In 
order  to  complete  the  list,  however,  it  has  also  been  necessary  to  include 
a  few  unfamiliar  configurations. 




Before  turning  to  the  detailed  description  of  the  elementary  sections 
it  may  be  desirable  to  emphasise  one  more  point.  The  composite  filter 
of the general type we have described may exist in a wide variety of
equivalent  forms.  The  choice  of  the  most  suitable  form  is  an  important 
practical  problem  although  it  is  beyond  the  scope  of  this  paper.  This 
flexibility  in  physical  configuration  is  due  in  part  to  the  fact  that  the 
elementary  structures  themselves  may  be  constructed  in  several  different 
ways.  In  part  also  it  is  due  to  the  fact  that  it  is  frequently  possible  to 
combine  a  number  of  the  elementary  sections  together  to  form  more 
complicated  networks.  If  we  like,  therefore,  the  composite  filter  can  be 
made  up  of  a  few  large  units  instead  of  many  small  ones. 

Besides  all  these  possibilities,  we  can  change  the  physical  configuration 
of  the  structure  by  changing  the  order  in  which  the  sections  occur. 
From  the  point  of  view  of  our  general  analysis  it  is  immaterial  what 
order  is  adopted  for  the  constituent  sections.  The  requisite  structures 
can  be  built  whatever  arrangement  is  chosen.  In  practice,  however,  one 
order  may  well  be  preferable  to  another.  Naturally,  sections  having 
complicated  image  impedance  characteristics  will  ordinarily  be  more 
expensive  than  those  having  simpler  characteristics.  In  arranging  the 
sections, therefore, it is desirable to make the required image impedances,
on the whole, as simple as possible. Since only the simple roots,
corresponding to unsymmetrical structures, make any change in the
image impedance, the problem is that of choosing their position in the
composite whole. Evidently, the best arrangement will, in general,
be one which places each unsymmetrical structure near the end at which
the impedance controlling frequency corresponding to its peak of attenuation
occurs, and leaves the symmetrical constituents to form the
central portion of the filter. With this arrangement, the central
sections will exhibit only the impedance controlling frequencies found in
all parts of the filter structure. In a practical design it may well be
expedient to go farther than this. By making some sacrifice of generality
we can choose simple roots coincident with most, if not all, of the
impedance controlling frequencies. By placing these simple roots near
the ends of the structure, then, most of the filter can be built with a very
simple image impedance characteristic.

PART V. SYMMETRICAL CONSTITUENTS OF THE COMPOSITE FILTER

The  transfer  constants  required  from  the  elementary  symmetrical 
constituents  of  the  general  composite  structure  are  those  which  give  rise 
to double negative real roots, double positive real roots, and double pairs
of conjugate complex roots. If the variable be taken as frequency
rather than $\lambda$ the corresponding classification is that into pairs of double
roots at positive and negative real frequencies, pairs of double roots at
positive and negative imaginary frequencies, and quartets of double
roots at positive and negative conjugate complex frequencies. The
sections corresponding to each of these various types of roots will be
obtained from prototype structures by a transformation analogous to
Zobel's m-derivation. In order to obtain the requisite variety of
transfer  constants,  however,  it  will  be  necessary  to  generalize  the  m- 
derivation  to  include  not  only  the  m’s  between  zero  and  one  originally 
contemplated  by  Zobel,  but  any  real  m’s  and  pairs  of  conjugate  complex 
m’s  lying  in  the  right  half  of  the  plane.  The  three  classes  of  roots  then 
correspond  respectively  to  sections  derived  with  real  m’s  less  than  one, 
sections  derived  with  real  m’s  greater  than  one,  and  pairs  of  sections 
derived  with  conjugate  complex  m’s. 

For the sake of analytic simplicity the prototype sections will be
assumed to be lattice structures and all of our m-derivations will be
carried out in terms of that configuration.** The use of lattice structures
is convenient both because it allows us to perform all of the necessary
m-derivations without obtaining non-physical elements and because in
networks of this type it is a simple matter to associate a given transfer
constant with any required image impedance characteristic. Otherwise,
however, our procedure will be very similar to that adopted for
familiar  ladder  networks.  The  connection  between  the  two  methods  of 
analysis  will  be  brought  out  more  clearly  in  a  later  section  devoted  to 
showing  the  equivalence  between  certain  simple  lattice  networks  and 
standard  ladder  structures. 

Since the lattice is frequently an inconvenient configuration to construct
physically it would be desirable for practical purposes to extend
the discussion of ladder type equivalences for the lattice to include also
the equivalent configurations which may be used when the conversion
to a ladder network is impossible. The consideration of these alternative
configurations is unfortunately beyond the scope of this paper.
We may mention, however, that they can usually be treated by Bartlett's
bisection theorem.* The variety of possibilities is much increased
if we include the configurations which may be found by combining two
or more simple lattice structures in tandem before the bisection theorem
is applied.

** Mr. O. J. Zobel has pointed out to the writer that many of the resulting
structures are similar in physical configuration to certain networks described in
his U. S. Pat. 1,603,305, though otherwise unpublished. They are, of course,
in the same sense, special cases of the general lattice structures described by Cauer
and others in the references previously cited. Their particular importance for
the present discussion, however, lies in their connection, via the m-derivation,
with the theory of the general composite filter developed in the preceding section.

General  Theory  of  the  Lattice 

The lattice or Wheatstone bridge configuration is shown by Fig. 31,
the branch impedances being represented by $Z_x$ and $Z_y$. The equations
for its image impedance and transfer constant are known** to be

$$Z_I = \sqrt{Z_x Z_y} \tag{34}$$

and

$$\tanh\frac{\theta}{2} = \sqrt{\frac{Z_x}{Z_y}}. \tag{35}$$

It is easy to see either from the figure or from the equations that interchanging
$Z_x$ and $Z_y$ amounts merely to crossing the terminals at one end
of the structure. In other words, it introduces a constant phase shift
of $\pi$ radians without otherwise affecting the performance of the network.

Fig. 31. Lattice or Wheatstone Bridge configuration

The expression for $Z_I$ given by (34) is exactly the same, as a function
of $Z_x$ and $Z_y$, as is the general expression of equation (8) for $Z_I$ in terms
of $Z_{sc}$ and $Z_{oc}$, while the transfer constant expression of equation (35)
is again the same as the general expression in terms of $Z_{sc}$ and $Z_{oc}$,
except that it involves $\theta/2$ instead of $\theta$. If we identify $Z_x$ with $Z_{sc}$
and $Z_y$ with $Z_{oc}$, therefore, we can apply our previous general analysis
in terms of natural frequencies directly to the lattice. It is merely
necessary to double the transfer constant which finally results.
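
The lattice relations lend themselves to a quick numerical illustration. The sketch below is not from the paper: it reads (34) as $Z_I = \sqrt{Z_x Z_y}$ and (35) as $\tanh(\theta/2) = \sqrt{Z_x/Z_y}$, uses hypothetical function names, and picks two purely illustrative reactances of like sign so that the square roots stay on one branch.

```python
import cmath


def lattice_parameters(z_x, z_y):
    """Image impedance and tanh(theta/2) of a lattice, read from (34) and (35)."""
    z_image = cmath.sqrt(z_x * z_y)
    tanh_half_theta = cmath.sqrt(z_x / z_y)
    return z_image, tanh_half_theta


def lattice_branches(z_image, tanh_half_theta):
    """Invert (34) and (35) to recover the branch impedances Z_x and Z_y."""
    return z_image * tanh_half_theta, z_image / tanh_half_theta


# Illustrative branch reactances in ohms (assumed values, not from the paper).
z_x, z_y = 200j, 800j

z_image, t_half = lattice_parameters(z_x, z_y)
z_image_swapped, t_half_swapped = lattice_parameters(z_y, z_x)

theta = 2.0 * cmath.atanh(t_half)
theta_swapped = 2.0 * cmath.atanh(t_half_swapped)

print("image impedance:", z_image, "after interchange:", z_image_swapped)
print("tanh(theta/2):", t_half, "after interchange:", t_half_swapped)
print("change in transfer constant:", theta_swapped - theta)
```

With these values the interchange leaves the image impedance alone and replaces $\tanh(\theta/2)$ by its reciprocal, so the transfer constant changes only by an imaginary part of magnitude $\pi$. For reactances of opposite sign the principal square root may return the negative of the reciprocal, which is the usual sign ambiguity of the radical rather than a different network.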

* A. C. Bartlett, "Theory of Electrical Artificial Lines and Filters," p. 28.

** See Campbell, "Physical Theory of the Electric Wave Filter," loc. cit.




The fact that $\theta/2$ appears instead of $\theta$ in equation (35) is a reflection
of the fact that the lattice is symmetrical and that its roots, therefore,
are double. In order to calculate them we need merely double the
multiplicity of the roots obtained by setting (35) equal to unity, since
it is easy to show that a simple root of $\tanh(\theta/2) = 1$ is always a double
root of $\tanh\theta = 1$. As a result of this symmetry the analysis of the
lattice in terms of critical frequencies is subject to fewer restrictions than
was our previous general discussion. The restrictions which are avoided
are the two dealing respectively with the allowable multiplicity of
complex roots and with the value of the transfer constant expression
along the positive real $\lambda$-axis, which turned up in the analysis of Part IV.
The only restrictions which need concern us in the lattice are those
obtained in Part III, as a consequence of the separation condition on
the normal coordinates of constrained and unconstrained systems.
Perhaps a simple way of stating the same result is to say that since
$Z_x$ and $Z_y$ are merely two independent two-terminal reactances the only
conditions we need consider in forming the expressions for $Z_I$ and
$\tanh(\theta/2)$ in the lattice are those given by Foster's theorem. If we once
grant the restriction of symmetry, therefore, the lattice can be used to
exemplify all of the critical frequency relations suggested by the discussion
of Part III.
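
The statement that a simple root of $\tanh(\theta/2) = 1$ is a double root of $\tanh\theta = 1$ can be made explicit in one line. The following worked step is a sketch added for illustration; the abbreviation $t$ is introduced here and does not appear in the paper.

```latex
% Let t = \tanh(\theta/2). The double-angle formula for the hyperbolic tangent gives
\tanh\theta = \frac{2t}{1 + t^{2}},
\qquad
1 - \tanh\theta = \frac{1 + t^{2} - 2t}{1 + t^{2}} = \frac{(1 - t)^{2}}{1 + t^{2}} .
% Near a root of t = 1 the denominator is close to 2 and does not vanish,
% so a simple zero of 1 - t appears squared, that is, as a double zero of 1 - \tanh\theta.
```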


Special  Properties  of  the  Lattice 


In our future analysis we shall require three further properties of the
lattice. The first has already been suggested by the discussion of the
preceding section. It can be expressed by the statement that if the
equations for $Z_I$ and $\tanh(\theta/2)$ in the lattice individually meet the
requirements laid down for the general image impedance and transfer
constant expressions in Part III, the branches of the lattice will be
physically realisable. In other words, if $Z_I$ and $\tanh(\theta/2)$ meet these
requirements, the branch impedances, as determined from (34) and (35)
by the equations $Z_x = Z_I \tanh(\theta/2)$ and $Z_y = Z_I/\tanh(\theta/2)$, will satisfy the
restrictions of Foster's theorem. The statement is easily verified by
trial. It is especially important for the purposes of the present discussion
because it shows that no particular attention need be paid to the
problem of providing the required image impedance characteristics in
the lattice. If structures with the necessary transfer constant characteristics
can be found, the requisite image impedances can be obtained
simply by multiplying the impedance of each lattice branch by appropriate
resonant and anti-resonant factors.




A second important relation in the theory of lattice structures is that
between two simple lattice structures in tandem and a single more
complicated lattice.** It is assumed that all three sections have the
same image impedance. To obtain the equivalence, we will suppose
that $\theta_1$ and $\theta_2$ are the transfer constants of the two simple sections, that
$\theta$ is the transfer constant of the single section replacing them, and that
$Z_I$ is the common image impedance of all three sections. Then since
$\theta = \theta_1 + \theta_2$, we have


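$$Z_I \tanh\frac{\theta}{2} = Z_I \tanh\frac{\theta_1 + \theta_2}{2} = Z_I\,\frac{\tanh\frac{\theta_1}{2} + \tanh\frac{\theta_2}{2}}{1 + \tanh\frac{\theta_1}{2}\tanh\frac{\theta_2}{2}} \tag{36}$$

which can be rewritten as

$$Z_I \tanh\frac{\theta}{2} = \cfrac{1}{\cfrac{1}{Z_I\tanh\frac{\theta_1}{2} + Z_I\tanh\frac{\theta_2}{2}} + \cfrac{1}{\cfrac{Z_I}{\tanh\frac{\theta_1}{2}} + \cfrac{Z_I}{\tanh\frac{\theta_2}{2}}}}. \tag{37}$$

Fig. 32. Combining formula for lattice networks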

** The equivalent configuration shown in Fig. 32 is in a form particularly suitable
for the problem to which we will later apply it. Another, less suitable, configuration
for the lattice equivalent of the two elementary sections was discovered
by Cauer ("Physics," Apr. 1932, p. 255). At first sight it may seem surprising
that  two  different  configurations  for  the  same  equivalent  should  exist.  The 
apparent  anomaly  is  explained  however,  if  it  is  observed  that  the  four  impedances 
in  each  branch  of  the  equivalent  lattice  of  Fig.  32  can  be  considered  as  constituting 
a  bridge,  the  galvanometer  arm  in  one  bridge  being  open-circuited  while  in  the 
other  it  is  short-circuited.  When  the  image  impedances  of  the  original  sections 
are  equal,  however,  each  of  these  bridges  is  balanced  and  it  makes  no  difference 
what  impedance  is  placed  in  the  galvanometer  arm.  The  Cauer  configuration  is 
the  result  obtained  when  both  galvanometer  arms  are  short-circuited. 




It is clear, however, that the quantity $Z_I \tanh(\theta/2)$ must be the $Z_x$ branch
of the final structure. Similarly, the other quantities, $Z_I \tanh(\theta_1/2)$,
$Z_I \tanh(\theta_2/2)$, etc., in the expression represent the $Z_x$ and $Z_y$ branches of
the constituent lattices. If we write these as $Z_{x_1}$, $Z_{x_2}$, $Z_{y_1}$ and $Z_{y_2}$, then,
equation (37) becomes

$$Z_x = \cfrac{1}{\cfrac{1}{Z_{x_1} + Z_{x_2}} + \cfrac{1}{Z_{y_1} + Z_{y_2}}} \tag{38}$$

which represents the configuration shown by the series branch of the
structure on the right-hand side of Fig. 32.** The configuration of the
$Z_y$ branch of this structure can be established similarly.
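
The combining formula can be verified numerically for arbitrary transfer constants. The following sketch is an illustration only. The $Z_x$ branch follows (38); the form used for the $Z_y$ branch, $(Z_{x_1} + Z_{y_2})$ in parallel with $(Z_{x_2} + Z_{y_1})$, is not given in the text and is an assumption obtained from the same addition formula. The numerical values and function names are hypothetical.

```python
import cmath


def parallel(z_a, z_b):
    """Impedance of two branches connected in parallel."""
    return z_a * z_b / (z_a + z_b)


def lattice_branches(z_image, theta):
    """Branches Z_x = Z_I tanh(theta/2) and Z_y = Z_I / tanh(theta/2)."""
    t = cmath.tanh(theta / 2.0)
    return z_image * t, z_image / t


def combine(z_x1, z_y1, z_x2, z_y2):
    """Branches of a single lattice replacing two matched lattices in tandem."""
    z_x = parallel(z_x1 + z_x2, z_y1 + z_y2)  # equation (38)
    z_y = parallel(z_x1 + z_y2, z_x2 + z_y1)  # assumed form, not stated in the text
    return z_x, z_y


z_image = 600.0                            # common image impedance (illustrative)
theta_1, theta_2 = 0.4 + 1.1j, 0.9 - 0.3j  # illustrative transfer constants

z_x1, z_y1 = lattice_branches(z_image, theta_1)
z_x2, z_y2 = lattice_branches(z_image, theta_2)
z_x, z_y = combine(z_x1, z_y1, z_x2, z_y2)
z_x_direct, z_y_direct = lattice_branches(z_image, theta_1 + theta_2)

print("Z_x mismatch:", abs(z_x - z_x_direct))
print("Z_y mismatch:", abs(z_y - z_y_direct))
print("image impedance of combination:", cmath.sqrt(z_x * z_y))
```

Both mismatches are at rounding level, and the product $Z_x Z_y$ of the combined lattice reproduces the square of the common image impedance, as the equivalence requires.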


One  further  relation  must  be  mentioned.  Suppose  we  multiply  the 
series  branches  and  divide  the  shunt  branches  of  the  original  lattice  by  a 
constant  m  thus  securing  the  structure  of  Fig.  33.  It  is  obvious  from 
equations  (34)  and  (35)  that  the  new  structure  will  have  the  same 
image  impedance  as  the  old  one,  while  its  transfer  constant  will  be 
related  to  that  of  the  original  lattice  by  the  formula 

$$\tanh\frac{\theta_m}{2} = m \tanh\frac{\theta_k}{2} \tag{39}$$

where  the  subscripts  m  and  k  refer  respectively  to  the  new  structure 
and  the  original  structure.  These  effects  upon  the  image  impedance 
and  transfer  constant  are  formally  the  same  as  those  found  in  m-deriving 
a  ladder  section  in  the  usual  manner.  We  can  therefore  consider  that 
Fig.  33  represents  an  m-derivation  of  the  lattice  structure  similar  to  the 
m-derivations  familiar  in  conventional  filter  design.  In  some  respects, 

** The broken lines in this and subsequent lattice drawings indicate $Z_x$ and $Z_y$
impedances similar to those shown explicitly.




however,  the  derivation  thus  described  is  more  general  than  that  to 
which  we  are  accustomed.  For  example,  the  m  of  Fig.  33  can  be  any 
positive  real  quantity  and  need  not  be  restricted  to  the  range  between 
zero  and  one.  Moreover,  the  derivation  can  be  applied  to  all-pass 
sections  as  well  as  to  filter  structures.  It  should  also  be  observed  that 
its  effects  on  the  attenuation  and  phase  characteristics  of  the  structure 
resemble  those  to  which  we  are  accustomed  in  ladder  networks  only 
when the analytic form for the $\tanh(\theta_k/2)$ which appears in (39) is as simple
as it is in ladder sections.** Since our prototype lattice structure will
always  have  transfer  constants  of  these  simple  types  however,  this 
remark  has  no  direct  application  to  the  sections  with  which  we  shall 
have  to  deal. 
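
A short numerical sketch may help fix the effect of the generalized m-derivation. It is an illustration, not the paper's procedure: the branch values are arbitrary, the function names are hypothetical, and the low-pass prototype form $\tanh^2(\theta_k/2) = -x^2/(1 - x^2)$ is an assumption suggested by the factor $\sqrt{1 - x^2}$ in (31).

```python
import cmath
import math


def m_derive(z_x, z_y, m):
    """Multiply the series branches by m and divide the lattice branches by m."""
    return m * z_x, z_y / m


def root_x_squared(m):
    """x^2 at which m * tanh(theta_k/2) = 1 for the assumed low-pass prototype.

    Assumes tanh^2(theta_k/2) = -x^2 / (1 - x^2), so that m^2 * tanh^2 = 1
    gives x^2 = 1 / (1 - m^2).
    """
    return 1.0 / (1.0 - m * m)


z_x, z_y = 300j, 1200j  # illustrative prototype branches (assumed values)
m = 1.5                 # deliberately outside the range between zero and one

z_xm, z_ym = m_derive(z_x, z_y, m)
image_before = cmath.sqrt(z_x * z_y)
image_after = cmath.sqrt(z_xm * z_ym)
ratio = cmath.sqrt(z_xm / z_ym) / cmath.sqrt(z_x / z_y)

print("image impedance before and after:", image_before, image_after)
print("tanh(theta_m/2) / tanh(theta_k/2):", ratio)
print("root for m = 0.866:", root_x_squared(math.sqrt(0.75)))
print("root for m = 2:", root_x_squared(2.0))
```

The image impedance is unchanged and the ratio of the two hyperbolic tangents equals $m$, in agreement with (39). Under the assumed prototype, a value of $m$ below one places the root at a positive $x^2$, that is, at a real frequency, while a value above one gives a negative $x^2$, that is, an imaginary frequency.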


Prototype  Lattice  Sections 

It  is  evident  from  our  discussion  of  the  relation  between  the  lattice 
and  the  general  critical  frequency  analysis,  or  from  the  combining 
formula  just  mentioned,  that  lattice  filter  sections  of  great  complexity 
can be built up. If the filter is symmetrical, we can, in fact, represent it
completely by a single lattice structure. When the overall performance
of the filter must be adjusted with great care, it is frequently convenient
to assume that the filter is a single lattice in the preliminary stages of
the design.** Since the primary end of this paper, however, is to
determine  the  simplest  characteristics  which  go  to  make  up  a  filter  of 
general  configuration  and  to  show  that  they  exist  only  in  limited  variety, 
we  will  be  concerned  here  only  with  lattice  structures  of  fairly  simple 
types.  The  emphasis  placed  upon  simple  lattices  may  also  be  justified 
upon  the  grounds  that  from  them  all  more  complicated  lattices  can  be 
built  up  by  means  of  the  combining  formula  previously  described  and 
that  it  is  only  the  simplest  lattice  structures  which  are  well  suited  for 
purposes  of  practical  construction.  This  last  point  is  discussed  at 
greater  length  in  the  conclusion. 

** That is, when it contains no transfer constant controlling factors. Otherwise
the derivation produces results more nearly akin to the complex m-derivation
described later. It is easy to show, for example, that the complex m-sections can
also be produced by deriving two prototype lattice sections with real values of m,
combining them, and deriving the combination with another real m.

** See, for example, the analysis in the paper on “Ideal Filters.”

The complexity of the lattice will, of course, depend upon the number
of transfer constant controlling frequencies it contains. The simplest
possible structures are found when the transfer constant expression
contains no such frequencies at all. Since we will examine only sections
giving one double root, we can restrict our attention to these cases.
With the help of the general analysis, then, we find that the transfer
constant expressions of interest are the following:


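[Equation (40): expressions for tanh θ/2 for the low-pass, high-pass, all-pass, and band-pass (Type A and Type B) sections; illegible in the source.]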


It will be seen that the two band-pass expressions are, except for a
constant multiplier, merely reciprocals of one another. Since taking
the reciprocal of tanh θ/2 in the lattice is equivalent merely to crossing
either the input or the output terminals, the two expressions will not
lead to essentially different physical results as long as we adhere to the
lattice configuration. Both have been included, however, in order to
facilitate our later discussion of the ladder equivalents of these structures.
The reciprocal expressions might also have been written down
for each of the other filter types, but in these cases only the forms shown
are of interest in determining the ladder equivalent.

In the low-pass and high-pass expressions fc represents the cut-off,
while in the all-pass structure it is an arbitrary real constant determining
the unit of frequency. In the band-pass expressions the lower and upper
cut-offs are represented respectively by f1 and f2. In each case, m is the
parameter introduced in the preceding section. We shall consider the
“prototype” transfer constant expressions to be those found when m = 1.

The  physical  configurations  of  the  corresponding  lattice  structures 
will  of  course  depend  upon  the  image  impedances  we  assign  to  them. 


Fig. 34. Lattice configurations for elementary symmetrical constituents of
the general composite filter. [Tabulated by filter class (low-pass, high-pass,
all-pass, band-pass Type A, band-pass Type B), with the Zx and Zy branch
configurations for each; Type B is the same as Type A with the Zx and Zy
branches interchanged.]


The individual consideration of all of the possible configurations which
might be found is evidently a considerable undertaking and would contribute
very little to this discussion. The configurations which are
found when the image impedance contains no impedance controlling
frequencies have, however, been listed and are shown by Fig. 34. Even
with this restriction, there are a number of possibilities for each type of
filter, since we have still to choose whether each cut-off factor in the
image impedance is to be found in the numerator or denominator.
The configurations are supposed to include those obtained from them
merely by interchanging the Zx and Zy branches.

m-Derivation of Prototype Lattice Structures

Each of the expressions for tanh² θ/2 which we secure from the above
table is a first degree rational function in f². It follows, therefore, that
each expression will give a single root of the equation tanh θ/2 = 1, which,
of course, will then be a double root of the equation tanh θ = 1. In
each expression, moreover, the quantity which can be chosen arbitrarily
is m. In order to secure a root at any prescribed frequency, we must
consequently assign m a value which makes tanh θ/2 = 1 at that point.

If,  for  example,  we  consider  in  particular  the  low-pass  structure,  we 
readily  find  that  the  required  m  is 


where f∞ represents the desired root. If, then, f∞ is assigned a real
value in the attenuating range, we see at once that the corresponding m
is a real quantity < 1. If f∞ is a pure imaginary, the corresponding m
is a real quantity > 1, while if f∞ is complex, m will also be complex.
Moreover, it is apparent that conjugate complex values of f∞ correspond
to conjugate complex m's.
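As a hedged illustration of how such a value of m can be found, suppose the low-pass expression has the form written below. This form is an assumption, chosen only because it agrees with the properties stated in the text (tanh² θ/2 of first degree in f², with m entering as a multiplier); the symbols f_c for the cut-off and f_∞ for the root are notation adopted here.

```latex
% Assumed low-pass form (a sketch, not quoted from the source):
\tanh\frac{\theta}{2} \;=\; m\,\frac{i f/f_c}{\sqrt{1 - f^2/f_c^2}}

% Require \tanh(\theta/2) = 1 at f = f_\infty and square both sides:
-\,m^2\,\frac{f_\infty^2}{f_c^2} \;=\; 1 - \frac{f_\infty^2}{f_c^2}
\quad\Longrightarrow\quad
m^2 \;=\; 1 - \frac{f_c^2}{f_\infty^2}
```

Under this assumption a real f_∞ beyond the cut-off gives 0 < m < 1, a pure imaginary f_∞ gives m > 1, and a complex f_∞ gives a complex m, with conjugate roots giving conjugate m's.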

This  same  analysis  can  be  conducted  for  all  of  the  structures.  The 
results  are  tabulated  below : 

1. Low-Pass and High-Pass Filters

(a) Real r's in attenuating ranges correspond to real m's < 1.

(b) Imaginary r's correspond to real m's > 1.

(c) Conjugate complex r's correspond to conjugate complex m's.

2. All-Pass Structures

(a) Imaginary r's correspond to real m's.

(b) Conjugate complex r's correspond to conjugate complex m's.

3. Band-Pass Structures, Type A

(a) Real r's between zero frequency and the lower cut-off, f1,
correspond to real m's < 1.

(b) Real r's between the upper cut-off, f2, and infinite frequency
correspond to real m's > f2/f1.

(c) Imaginary r's correspond to real m's between 1 and f2/f1.

(d) Conjugate complex r's correspond to conjugate complex m's.

4. Band-Pass Structures, Type B

(a) Real r's between infinite frequency and the upper cut-off, f2,
correspond to real m's < 1.

(b) Real r's between the lower cut-off, f1, and zero frequency
correspond to real m's > f2/f1.

(c) Imaginary r's correspond to real m's between 1 and f2/f1.

(d) Conjugate complex r's correspond to conjugate complex m's.
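The low-pass entries of this table can be checked numerically. The sketch below assumes a low-pass expression of the form tanh θ/2 = m (i f/f_c)/√(1 − f²/f_c²), which is consistent with the text but is not quoted from it; the function names and the normalization f_c = 1 are hypothetical. The band-pass and all-pass entries are not covered, since their expressions are not assumed here.

```python
import cmath

def m_for_root(f_inf, f_c=1.0):
    """m placing the root of tanh(theta/2) = 1 at f_inf (assumed low-pass form)."""
    return cmath.sqrt(1 - (f_c / f_inf) ** 2)

def tanh_half_sq(f, m, f_c=1.0):
    """Square of the assumed expression m*(i f/f_c)/sqrt(1 - f^2/f_c^2)."""
    u = (f / f_c) ** 2
    return -(m ** 2) * u / (1 - u)

# Root at a real frequency in the attenuating range (above the cut-off).
real_root = m_for_root(2.0)
# Root at a pure imaginary frequency.
imag_root = m_for_root(2.0j)
# A pair of conjugate complex roots.
cplx_root = m_for_root(1.5 + 0.5j)
cplx_conj = m_for_root(1.5 - 0.5j)
```

Working with the square of tanh θ/2 avoids any ambiguity in the branch of the square root when f is complex.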

Physical Construction of Complex m-Derived Sections

It is apparent that the construction of m-derived sections with real
values of m presents no difficulty. To obtain such a section we need
merely multiply each impedance in the prototype section by an appropriate
real constant. Since a section derived with a complex value of m
will naturally contain complex elements, on the other hand, it does
not appear that the complex m-derivations required in the above table
will lead to a physical result. We can at least go through the formal
process, however, and the final result will be physical if we add together
the two complex m-sections which correspond to conjugate complex
roots. This is illustrated by Fig. 35. On the left-hand side of the
figure the two structures are supposed to have been derived from the
prototype section with m's equal to, respectively, m1 and m2. When
the networks are added together by means of the equivalence of Fig. 32,
they produce the single lattice shown on the right of Fig. 35. The
combined network is physically realizable when m1 and m2 are conjugate
complex quantities having positive real components. Since the m's
corresponding to complex roots always satisfy these requirements, we
can represent by means of this structure all of the complex roots called
for by our general theory.

Fig. 35. Combination of m-derived lattices
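A short numerical sketch may make the realizability argument concrete. It assumes that adding two sections adds their transfer constants, so that the hyperbolic tangent addition identity applies, and it again assumes a low-pass form tanh θ/2 = m·x with x² = −u/(1 − u), u = f²/f_c². All names are hypothetical. With conjugate m's, the combined expression depends on m1 + m2 and m1·m2 only, and both are real.

```python
import cmath

def m_for_root(f_inf, f_c=1.0):
    """m for a root at f_inf under the assumed low-pass form."""
    return cmath.sqrt(1 - (f_c / f_inf) ** 2)

def x_sq(f, f_c=1.0):
    """x^2, where tanh(theta/2) = m*x in the assumed low-pass form."""
    u = (f / f_c) ** 2
    return -u / (1 - u)

def combined_tanh_half_sq(f, m1, m2, f_c=1.0):
    """tanh^2((theta1 + theta2)/2) from tanh(a+b) = (ta + tb)/(1 + ta*tb)."""
    s = m1 + m2          # coefficient of x in the numerator
    p = m1 * m2          # coefficient of x^2 in the denominator
    xs = x_sq(f, f_c)
    return s * s * xs / (1 + p * xs) ** 2

f_root = 1.5 + 0.5j                  # one of a pair of conjugate complex roots
m1 = m_for_root(f_root)
m2 = m1.conjugate()
coeff_sum = m1 + m2                  # equals 2*Re(m1)
coeff_prod = m1 * m2                 # equals |m1|^2
value_at_root = combined_tanh_half_sq(f_root, m1, m2)
value_at_conj = combined_tanh_half_sq(f_root.conjugate(), m1, m2)
```

The positive real part of m1 makes the sum m1 + m2 positive, which corresponds to the condition on positive real components stated in the text.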

Ladder Type Configuration for Elementary Sections

As we previously suggested, this analysis is, in principle, merely
another version of Zobel's familiar work on ladder structures. The
adoption of the lattice configuration allows us to extend the m-derivation
to a wider variety of m's than is possible in ladder networks and
makes it easy to assign any required image impedance characteristic to
the structure, but otherwise the two procedures are essentially the same.


Fig. 37. Equivalence of lattice and Π networks


The relation between the two can perhaps be brought out more clearly
by means of the equivalences between lattice and T or Π structures
shown by Figs. 36 and 37.** It will be seen from the figures that a T or a Π
can always be replaced by a physically realizable lattice but that the
converse is not true. For example, we can convert a lattice structure
into a physically realizable T, using the relations of Fig. 36, only if it is
possible to subtract the Zx branch of the lattice from the Zy branch without
leaving a non-physical remainder. Similarly, the conversion from a
lattice to a Π is possible only provided the Zx branch of the lattice can
be divided into two physically realizable parallel impedances, one of
which is equal to the Zy branch.

** These equivalences have been given frequently in previous publications.
See, for example, Zobel, loc. cit., p. 19.
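The two conversions can be sketched numerically. The formulas below are the standard lattice equivalences and are assumed to be those of Figs. 36 and 37, which are not reproduced here: the T has series arms Zx and a shunt arm (Zy − Zx)/2, and the Π has shunt arms Zy and a series arm of admittance (1/Zx − 1/Zy)/2. Function names and the sample reactances are hypothetical.

```python
def lattice_z_params(zx, zy):
    """Open-circuit parameters (z11, z12) of a symmetrical lattice."""
    return (zx + zy) / 2, (zy - zx) / 2

def lattice_to_t(zx, zy):
    """(series arm, shunt arm) of the equivalent symmetrical T."""
    return zx, (zy - zx) / 2

def lattice_to_pi(zx, zy):
    """(series arm, shunt arm) of the equivalent symmetrical Pi."""
    y_series = (1 / zx - 1 / zy) / 2
    return 1 / y_series, zy

def t_z_params(z_series, z_shunt):
    """(z11, z12) of a symmetrical T."""
    return z_series + z_shunt, z_shunt

def pi_z_params(z_series, z_shunt):
    """(z11, z12) of a symmetrical Pi, by inverting its admittance matrix."""
    y11 = 1 / z_shunt + 1 / z_series
    y12 = -1 / z_series
    det = y11 * y11 - y12 * y12
    return y11 / det, -y12 / det

# Branch reactances at a single frequency (illustrative values only).
zx, zy = 2j, 5j
lat = lattice_z_params(zx, zy)
tee = t_z_params(*lattice_to_t(zx, zy))
pi_ = pi_z_params(*lattice_to_pi(zx, zy))
```

With these sample values Zy − Zx and 1/Zx − 1/Zy are both reactances of the same kind as the branches themselves, so both ladder forms exist; interchanging zx and zy would make the T shunt arm and the Π series arm change sign.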

Whether or not the two branches of the lattice can be related in
either of these manners will depend, of course, upon the image impedance
assigned to the structure. If we consider only the structures with
very simple image impedances shown by Fig. 34, however, the problem
is easily investigated. An inspection of that figure shows that in each
case except that of the all-pass either the Zy branch of the lattice includes
a series impedance similar in physical configuration to the impedance of
the Zx branch or the Zx branch includes a parallel impedance similar to
the impedance of the Zy branch. With the exception of the all-pass
structure, therefore, every lattice can be converted to either a T or a Π
provided the parameter m lies within a range which gives a suitable
numerical relation between the impedances of the two branches. In
the low-pass structure of Fig. 34a, for example, the inductances in the
two branches are just equal when m = 1 and the inductance in the Zx
branch is less than that in the Zy branch when m < 1. We can consequently
identify the Zx inductance with the series impedance of Fig. 36 and use the
equivalence of that figure to convert the structure to a T when m ≤ 1.
If we use the low-pass configuration of Fig. 34b, on the other hand, the
equivalence of Fig. 37 allows us to convert the network to an equivalent
Π for the same range of values of m.

The equivalent ladder networks shown in Fig. 38 have been obtained
by this method. Each network is drawn for the special value m = 1.
The low-pass and high-pass structures are identical with the usual
“constant-k” low-pass and high-pass prototypes of ladder filter analysis
while the band-pass structures are the so-called “three-element”
sections** of the ladder theory. For m's less than 1, of course, the equivalent
ladder networks will be simply m-derivatives of these structures,
in all respects identical with the m-derived structures of the usual
analysis.

These results are about what we might expect from standard filter
theory, if we assume that the m notation used in describing the lattice
corresponds to that usually employed. As reference to the preceding
discussion will show, the m's for which ladder equivalents can be obtained
are those for which the root of the lattice equation tanh θ/2 = 1
is real and negative.** No conversion to the ladder form, with ordinary
elements, appears to be possible in lattices with complex or positive
real roots. Positive real root sections can, however, be built in the
ladder form with the help of mutual inductance between the coils of
the section.** Complex root sections, on the other hand, appear to be
realizable only in the lattice form or in some analogous bridge type
configuration, such as a bridged-T or a ladder having mutual inductance
or other linkages between different sections.

Fig. 38. Ladder configurations for elementary symmetrical constituents of
the general composite filter.

** Called types V and VI in Zobel's list.

** The ladder equivalences which we have just described hold only when the
image impedance of the lattice belongs to one of the simplest types. By using
the more complicated ladder structures described in the next section, however,
the conversion to the ladder form can be made with unrestricted image impedance
characteristics.

** See, for example, the list of structures given by K. S. Johnson and T. E. Shea
in “Mutual Inductance in Wave Filters,” Bell System Technical Journal, Jan.
1925.

PART  VI.  UNSYMMETRICAL  CONSTITUENTS  OF  THE  COMPOSITE  STRUCTURE 

The general theorem on the relation between the image impedances
and the transfer constant of a reactive structure shows that if the structure
has the same image impedance at both ends, all of the roots of the
equation tanh θ = 1 must necessarily be of even multiplicity. We
have been able to show how any possible system of double roots can be
represented by lattices. We can, therefore, turn to the representation
of the simple roots, corresponding to the unsymmetrical constituents of
the complete composite structure. In the discussion of the double root
constituents we found it easy to adjust the image impedance of the network
to any suitable form and our chief problem was that of obtaining
and studying the various types of transfer constants which were necessary.
Since the simple roots must necessarily occur only at real frequencies,
on the other hand, the variety of transfer constant characteristics
to which they may correspond is much more limited than that
which can be obtained from symmetrical structures. There is no place
here for the complex m's and the real m's greater than one which figured
so largely in the analysis of the lattice. All of the transfer constant
characteristics which we may require can be obtained from simple
ladder type half-sections. So far as this aspect of unsymmetrical
structure is concerned, therefore, the problem is an elementary one.
Our difficulties lie in going from the simple types of image impedance
characteristics obtainable from conventional ladder structures to the
much greater variety necessary if the networks are to enter as constituents
of a general composite filter.

We will approach the construction of unsymmetrical structures from
two different points of view. The first of these is suggested by our
analysis of the symmetrical lattice. It amounts essentially to the conversion
of a double root lattice structure having the required image
impedance and twice the required transfer constant into an equivalent
form consisting of two similar halves, each of which is a ladder structure.
Each half, then, will be the desired network representation of the single
root.

This method is simple and direct, but it suffers from the disadvantage
that it does not lead to simple explicit formulae for the network elements.
Moreover, it does not indicate clearly the variety of configurations
which can actually be obtained. We shall, therefore, also present
a second process which starts with a ladder type half-section of a familiar
sort and obtains from it structures having the same transfer constant
but a wider variety of image impedances by a series of simple transformations.
The two procedures amount essentially to the solution of
the same problem from two different directions, since the first is equivalent
to breaking down the lattice into the form of a ladder structure,
while the second can be regarded as the construction of the lattice from
the ladder by direct synthesis.

The second method is itself a combination of two other methods,
which differ in the frequency ranges in which they are competent to
introduce impedance controlling factors. In a low-pass filter, for
example, the first of these sub-methods introduces impedance controlling
factors only beyond the frequency at which tanh θ = 1, while the second
sub-method introduces impedance controlling factors only between the
cut-off and this frequency. Both must, therefore, be used if any desired
set of impedance controlling factors is to be obtained. The first sub-method
is particularly interesting because it closely resembles the multiple
m-derivation recently described by O. J. Zobel.** Since Zobel's
analysis is not well suited for our purposes, however, the structures he
finds will be treated in a different fashion in the work which follows.
The new treatment, when taken in conjunction with the rest of our
theory, is considerably simpler than Zobel's procedure and, in the case of
certain band-pass filters, it also leads to somewhat more general results.

Bisection  of  the  Lattice 

Our first method of constructing unsymmetrical filter sections depends
upon the transformations shown by Figs. 39 and 40. They represent
the fact that any impedance which appears in series or in shunt with both
lattice branches can be taken out and placed, correspondingly, in series
or in shunt with the lattice as a whole. Each equivalence can be proved
by converting the original lattice to a T or a Π, using the relations of
Figs. 36 and 37, removing the impedances which appear in series or shunt
with both of the original lattice branches, and then transforming the
remainder of the network back to the lattice form. We shall suppose that
the lattice to which the transformations of Figs. 39 and 40 are to be
applied is one having a double root at the real frequency at which the
simple root defining the desired unsymmetrical structure occurs.
Tanh θ/2 for the lattice, then, will be the same as tanh θ for the
unsymmetrical structure.** We shall also suppose that the image impedance
of the lattice is the same as that found at one end of the unsymmetrical
structure. The second image impedance of the unsymmetrical structure
can, of course, be obtained by the general theorem on the relation
between the image impedances and the roots, and need not be considered
explicitly.

This method of constructing unsymmetrical structures depends upon
the fact that when the lattice with which we start is of the type described,
it is possible to apply the transformations shown in Figs. 39 and 40
alternately until the structure is completely developed into the form
of a ladder network. For example, if we start by representing common
series impedances in both branches by series impedances external to the
structure as a whole, using the equivalence of Fig. 39, it will be found that
the branches of the reduced lattice will always contain common parallel
impedances. We can, therefore, apply the equivalence of Fig. 40 to
convert the structure to a form in which both series and shunt impedances
appear externally. As this process is repeated the impedances
of the lattice branches become simpler and simpler and finally disappear.
In the end, therefore, we obtain merely a ladder structure composed of
two symmetrical halves. It is evident from symmetry that each half
will have one image impedance identical with that of the original lattice
and a transfer constant which is half that of the original lattice. Each
half, therefore, is the unsymmetrical structure which we seek. By
varying the choice of the series or shunt impedance to be placed outside
the structure as a whole at each step, we can secure a wide variety of
possible networks.

Fig. 39. Lattice structure with developed series impedances

Fig. 40. Lattice structure with developed shunt impedances

Fig. 41. Frequency pattern for low-pass structure

Fig. 42

** Bell System Technical Journal, Apr. 1931.

** By interchanging the branches of the lattice, we can, of course, find two
expressions for the function tanh θ/2, one of them being the reciprocal of the
other. There will, correspondingly, be two reciprocal expressions for tanh θ in the
unsymmetrical structure. As our discussion of the theorem on the determination
of the transfer constant by the roots showed, however, only that one of these
expressions is physical which leads to values of tanh θ < 1 at positive real values of
X. It is important, therefore, to begin with the lattice which meets this condition
rather than with the lattice which gives the reciprocal function.
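The two extraction steps of Figs. 39 and 40 can be checked on the open-circuit parameters of the lattice. The sketch below assumes that an extracted impedance appears once at each end of the reduced lattice (in series for Fig. 39, in shunt for Fig. 40); the function names and sample reactances are hypothetical.

```python
def lattice_z(zx, zy):
    """Open-circuit parameters (z11, z12) of a symmetrical lattice."""
    return (zx + zy) / 2, (zy - zx) / 2

def with_series_at_each_end(z_params, z0):
    """Add an impedance z0 in series at both ends of a symmetrical network."""
    z11, z12 = z_params
    return z11 + z0, z12

def with_shunt_at_each_end(z_params, y0):
    """Add an admittance y0 in shunt at both ends of a symmetrical network."""
    z11, z12 = z_params
    det = z11 * z11 - z12 * z12
    y11, y12 = z11 / det + y0, -z12 / det     # shunt arms add to y11 only
    det_y = y11 * y11 - y12 * y12
    return y11 / det_y, -y12 / det_y

zx, zy = 1j, 4j            # reduced lattice branches (illustrative reactances)

# Series case: both branches of the full lattice contain a common series 3j.
z0 = 3j
full_series = lattice_z(zx + z0, zy + z0)
reduced_series = with_series_at_each_end(lattice_z(zx, zy), z0)

# Shunt case: both branches of the full lattice contain a common shunt y0.
y0 = -0.5j
full_shunt = lattice_z(1 / (1 / zx + y0), 1 / (1 / zy + y0))
reduced_shunt = with_shunt_at_each_end(lattice_z(zx, zy), y0)
```

In both cases the transfer parameter z12 (or y12) is untouched by the extraction, which is why the common element can be moved outside without altering the rest of the lattice.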

The fact that this process is always possible can best be shown with
the aid of an example. Let us assume that we are attempting to construct
a low-pass filter having a simple root at f∞ and one image
impedance characteristic given by

[Equation (46): image impedance expression in terms of the a and b factors; illegible in the source.]
where a represents the cut-off and the other a's are used to represent
impedance controlling factors between the cut-off and f∞, while the b's
represent impedance controlling factors beyond f∞. The corresponding
frequency pattern then must be that shown by Fig. 41. This will also
be the frequency pattern of the double root lattice with which we start
if we identify Zsc with Zx and Zoc with Zy. Tanh θ for the unsymmetrical
structure, or tanh θ/2 for the lattice, will be given by an expression
of the type ikf/√(1 − f²/a²), and must evidently follow the general course
shown by Fig. 42 in the range beyond the cut-off.

For the moment we will neglect the presence of the b factors. The
branches of the lattice then can be represented as shown by Fig. 43.
The anti-resonant networks identified by a1 and a2 in this figure are
evidently common series impedances which by means of the equivalence
of Fig. 39 can be put in series with the structure as a whole. It is clear
from the sketch of tanh θ given by Fig. 42, however, that these
anti-resonances occur at frequencies at which the ratio of Zx to Zy is greater
than one. The residue at the pole of impedance produced by one of the
anti-resonances of the Zx branch must consequently be greater than the
residue at the corresponding pole in the Zy branch.** When we apply the
equivalence of Fig. 39, therefore, the anti-resonance meshes in Zy can
be removed entirely while positive elements are still found in the Zx
branch. After this operation the network assumes the form shown by
Fig. 44. Now in the reduced lattice, the series branch can be transformed
by standard network equivalences into a form in which it appears
as a capacity in parallel with another physically realizable impedance.
At very high frequencies, the impedance of the branch will depend only
upon that capacity. Moreover, since Zy > Zx at very high frequencies,
from Fig. 42, this capacity will be larger than the capacity in the Zy
branch of the reduced lattice. We can consequently apply the equivalence
of Fig. 40 to reduce the structure to the form shown by Fig. 45.

This last form is a ladder network representation of the original
filter except that we have still to include the b factors. It is clear,
however, that if we multiply each branch of any network by any function
of frequency, the image impedance of the network will be multiplied
by that function of frequency while its transfer constant will be unchanged.
We can, therefore, introduce these factors merely by multiplying
the impedance of each branch of the network of Fig. 45, expressed
in the Foster's theorem forms shown in that figure, by the b factors.** The
ladder structure after this transformation is shown by Fig. 46. Either
half of this network is, of course, the unsymmetrical structure which is
required.

** That is, while the product LC, which determines the frequency of
anti-resonance, is the same for the two meshes, the ratio L/C is greater for an
anti-resonant mesh in Zx than for the corresponding anti-resonant mesh in Zy.
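For a symmetrical lattice the invariance used here can be seen in one line, assuming the standard lattice relations Z_I = √(Z_x Z_y) and tanh θ/2 = √(Z_x/Z_y). The multiplier F(f) below is a hypothetical name for the product of the b factors; the same scaling argument applies to any network whose branches are all multiplied by F.

```latex
% Multiply both lattice branches by the same function of frequency F(f):
Z_x \to F\,Z_x, \qquad Z_y \to F\,Z_y

% The image impedance is multiplied by F ...
Z_I \;=\; \sqrt{(F Z_x)(F Z_y)} \;=\; F\,\sqrt{Z_x Z_y}

% ... while the transfer constant is unchanged:
\tanh\frac{\theta}{2} \;=\; \sqrt{\frac{F Z_x}{F Z_y}} \;=\; \sqrt{\frac{Z_x}{Z_y}}
```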

The process we have just described applies when the last impedance
controlling frequency before the peak of attenuation is an anti-resonance.
If we choose an image impedance expression in which this last frequency
is a resonance it is desirable to begin by first removing common impedances
in parallel with both branches. Suppose, for example, that the
image impedance which is required is

[Equation (47): image impedance expression in terms of the a and b factors; illegible in the source.]

where the a's and b's have their previous significance. If, as before, we
neglect the b factors temporarily, the corresponding lattice can be built
in the form shown by Fig. 47. Each branch has a resonant circuit, and the
two resonate at a common frequency (one of the a's). The residue of the
admittance of the Zx branch at this frequency is, however, less than that
of the Zy branch because the resonant frequency occurs in the range where
Zx > Zy. If we make use of the transformation of Fig. 40, therefore,
the structure can be reduced to the form shown by Fig. 48. The Zy
branch of the new lattice can now be converted to an equivalent form in
which it consists of an inductance in series with another network, and
since Zy > Zx at high frequencies, this inductance will be larger than the
inductance in the Zx branch of the lattice of Fig. 48. With the help of
the transformation in Fig. 39, therefore, the network can be converted
to the form shown by Fig. 49. Introducing the b's, then, gives us the
final structure shown by Fig. 50.

The same general procedure can also be used for filters of other types.
In each case, if we follow exactly the plan which has been outlined, the
unsymmetrical structure which results will be a T or a Π. It is evident,
however, that the process can be varied in many ways, so that many
other configurations are possible. For example, in going from the
configuration of Fig. 43 to that of Fig. 44, we might have removed only
one instead of both of the anti-resonant meshes common to the two lattice
impedances. Since each of the branches in the lattice of Fig. 43 can be
represented as a capacity in parallel with another impedance we might
also have begun by placing capacities in parallel with the structure as a
whole. These various alternative configurations can best be obtained
by means of the method to be described in the following section. This
method has the additional advantage that it leads to explicit formulae
for the network elements in terms of the desired impedance controlling
frequencies.

** It is easy to show that the resonances and anti-resonances of each of the
branches of Fig. 45 before the multiplication by the b's lie within the range
between the cut-off and f∞, so that the introduction of these factors does not
violate the condition that resonances and anti-resonances must alternate.

Transformations of Ladder Type Half-Sections

Fig. 51. The prototype half-section

Our second method of constructing the unsymmetrical constituents
of the complete filter obtains them from ordinary ladder type half-sections
by direct transformations. We will represent the original half-section
by the configuration of Fig. 51. Its transfer constant and image
impedances will then be given by the familiar expressions


\tanh\theta_p = \sqrt{\frac{r}{1+r}},   (48)

Z_{I_1 p} = \sqrt{Z_a Z_b}\,\sqrt{1+r},   (49)

Z_{I_2 p} = \frac{\sqrt{Z_a Z_b}}{\sqrt{1+r}},   (50)

where  the  subscript  p  stands  for  prototype  and  r  represents  the  ratio 
Z./Zk.  The  half-eection  is  supposed  to  be  half  of  any  section  m-derived 
from  the  ladder  structures  of  Fig.  34.  For  each  class  of  filter,  therefore, 
the  half-section  of  Fig.  51  may  represent  any  one  of  several  possible 
configurations.  If  we  examine  the  image  impedances  of  the  haK- 
section  m-derivatives  of  all  of  the  structures  shown  by  Fig.  34,  for  each 
filter  class,  we  find  that  among  them  they  include  all  possible  arrange¬ 
ments  of  the  cut-off  frequencies  and  of  the  impedance  controlling  fre¬ 
quency  representing  the  peak  of  attenuation  with  reference  to  the  cut¬ 
offs.  We  can ,  therefore,  assume  once  for  all  that  the  adjustment  of  these 


348 


H.  W,  BODE 


portions of the final image impedance expressions presents no difficulty.
The problem which remains to be solved is that of associating with the
cut-off factors and the peak factor any possible other set of impedance
controlling frequencies. The new impedance controlling factors will
occasionally be dealt with as though they appeared in the conventional
form, $1 - f^2/f_a^2$, and occasionally as though they appeared as $1 + kr$,
where r is the frequency variable $Z_a/Z_b$ introduced in equations (48) to (50).
Which method we shall use is merely a matter of convenience. As r
will in all cases be a first degree rational function of $f^2$, we can go from
one expression to the other without difficulty.
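Equations (48) to (50) can be checked numerically from the open- and short-circuit impedances of the half-section. The Python sketch below is illustrative only. It assumes that the half-section of Fig. 51 consists of a series arm $Z_a$ at the mid-series end and a shunt arm $Z_b$ across the mid-shunt end; the function name and the sample reactances are hypothetical. Squared quantities are compared so that no branch of the square root has to be chosen.

```python
def half_section(za, zb):
    """Open- and short-circuit analysis of a ladder half-section.

    Assumed layout: series arm za at the mid-series end, shunt arm zb
    across the mid-shunt end. Returns the squares of tanh(theta) and of
    the two image impedances.
    """
    # Mid-series end: za + zb with the far end open, za alone with it shorted.
    z_oc1, z_sc1 = za + zb, za
    # Mid-shunt end: zb alone with the far end open, za parallel zb with it shorted.
    z_oc2, z_sc2 = zb, za * zb / (za + zb)
    tanh_sq = z_sc1 / z_oc1
    return tanh_sq, z_oc1 * z_sc1, z_oc2 * z_sc2


za, zb = 2j, -5j                 # sample reactances, chosen arbitrarily
r = za / zb
tanh_sq, zi1_sq, zi2_sq = half_section(za, zb)

print(abs(tanh_sq - r / (1 + r)))          # deviation from the square of (48)
print(abs(zi1_sq - za * zb * (1 + r)))     # deviation from the square of (49)
print(abs(zi2_sq - za * zb / (1 + r)))     # deviation from the square of (50)
```

All three printed deviations should be at the level of rounding error.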

Substitution Method of Introducing Impedance Controlling Factors

Fig. 52. Low-pass half-section before the application of the substitution process

Our first method of adding impedance controlling factors to ladder
type half-sections has already been suggested. It rests upon the simple
observation that the expression for the transfer constant given by
equation (48) depends only upon the ratio r of the impedances $Z_a$ and
$Z_b$. The image impedance expressions given by (49) and (50), on the
other hand, involve also the product $Z_aZ_b$ of the two impedances.
Exactly as in our previous analysis, then, new impedance controlling
factors can be added, without affecting the transfer constant, by
multiplying $Z_a$ and $Z_b$ by suitable expressions. There is, however, one
important limitation on this process which has not appeared previously.
Obviously, we can multiply $Z_a$ and $Z_b$ by new resonant and anti-resonant
factors, representing coincident zeros or coincident poles in the two
impedances, only at frequencies where they are reactances of the same
sign. This was no restriction in the lattice analysis, since the network
attenuated any frequency at which the ratio $Z_x/Z_y$ was positive. The
introduction of coincident zeros or coincident poles was, therefore,
possible in the lattice in any part of the attenuating band. The ratio of
$Z_a$ to $Z_b$ in all of the prototype ladder structures, however, is negative


GENERAL  THEORY  OF  ELECTRIC  WAVE  FILTERS 




between the transmitting band and the peak of attenuation. The
substitution of new impedances with additional zeros and poles for the
original $Z_a$ and $Z_b$ can be used to introduce impedance controlling
factors at any other frequency but it will be of no use in this range. In
order to supply impedance controlling factors at negative values of r,
we must make use of a different process to be described later.
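The observation on which the substitution process rests can be illustrated with a few lines of Python. This is only a sketch: the factor $1 - f^2/f_{new}^2$, the function names, and the numerical values are assumptions made for the illustration, not quantities taken from the paper. Multiplying both impedances by the same factor leaves the ratio r, and hence the transfer constant of (48), unchanged, while the product $Z_aZ_b$ entering (49) and (50) is multiplied by the square of the factor.

```python
def ratio_and_product(za, zb):
    """Return (r, Za*Zb): r fixes the transfer constant, the product enters the image impedances."""
    return za / zb, za * zb


def added_factor(f, f_new):
    """Hypothetical resonant factor with a zero at f_new."""
    return 1 - (f / f_new) ** 2


f = 1.3                      # frequency at which the check is made (arbitrary units)
za, zb = 2j, 5j              # reactances of the same sign, so r is positive here
g = added_factor(f, 2.0)     # common factor applied to both impedances

r_before, p_before = ratio_and_product(za, zb)
r_after, p_after = ratio_and_product(g * za, g * zb)

print(r_before, r_after)             # identical ratios
print(p_after / p_before, g ** 2)    # product scaled by the square of the factor
```

Since the image impedances contain the square root of the product, each of them acquires the new factor g exactly once.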

Fig. 53. Frequency pattern of the structure of Fig. 52

Fig. 54. Structure of Fig. 52 after application of the substitution process

As an example of the substitution process, we will assume that the
prototype half-section is the structure of Fig. 52, which is obtained by a
mid-shunt m-derivation of the structure of Fig. 38e. Its image
impedances are

[Equation (51) is illegible in the scanned source.]

where $a_{c_1}$ and $a_{c_2}$ represent the cut-offs and $a_\infty$ the peak of attenuation.
The frequency pattern of $Z_a$ and $Z_b$ is given by the solid line portion of
Fig. 53. In this filter, r is negative between the upper cut-off and the
peak of attenuation. The substitution process, therefore, does not allow
us  to  introduce  impedance  controlling  factors  into  this  range.  We  can, 
however,  introduce  factors  into  any  other  part  of  the  attenuating  band. 




One possible set of factors is shown by the broken lines of Fig. 53. The
new network is shown by Fig. 54. Its image impedances are

[Equation (52) is illegible in the scanned source.]


Superficially, this process appears to bear no resemblance to Zobel's
multiple m-derived method* of varying the impedance characteristics
of ladder type half-sections. Both methods, however, introduce
impedance controlling factors in the same general frequency ranges and both
lead to networks containing only two branches. Now, when a network
contains only two branches, both are fixed when the image impedance
and transfer constant have been specified, and we must conclude that
the two methods lead to substantially the same physical networks.
For low-pass and high-pass structures the physical results are, in fact,
identical and the substitution process appears to be recommended
chiefly by its greater simplicity and directness. For band-pass
structures it has also some advantage of generality, since the multiple
m-derivation offers no method of introducing impedance controlling
factors on the side of the band opposite the peak of attenuation.**


H-Derived  Networks 

Our method of introducing impedance controlling factors in ranges
where r is positive depends upon the substitution of more complicated
networks for the $Z_a$ and $Z_b$ impedances of the prototype. In order to
obtain impedance controlling factors when r is negative, on the other
hand, we shall increase the number of branches of the network, leaving
$Z_a$ and $Z_b$ substantially unaltered. The new series branches consist of
constant multiples of the short-circuit impedance of the network and
the new shunt branches of multiples of its open-circuit impedance.
Each branch introduces one factor into the image impedance
expressions. By adding a number of branches in succession, we can build
up, step-by-step, the desired combination of factors in the range where r
is negative.


* Loc. cit.

** It must be remembered, however, that the band-pass structures to which this
statement applies are of the type having an unsymmetrical attenuation
characteristic, with a peak of attenuation on only one side of the transmission band.
The band-pass structures actually considered by Zobel, however, were m-derivatives
of the constant-k prototype and had symmetrical attenuation characteristics.
For these structures the two methods lead to identical results.




Fig. 55

The method will be most easily understood if we assume that $\lambda Z_{sc}$, a
constant multiple of the short-circuit impedance of the network, is
added in series with the mid-shunt terminals of the structure of Fig. 51.
We can suppose that Fig. 51 represents either one of the original
prototype half-sections or a structure derived from them by the substitution
process we have just described. The new structure will appear in the
form shown by Fig. 55. Its open- and short-circuit impedances, as seen
from the 1-2 terminals, are


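$$Z_{sc} = (1+\lambda)\,\frac{Z_a Z_b}{Z_a + Z_b} \tag{53}$$

and

$$Z_{oc} = Z_b\,\frac{(1+\lambda) Z_a + Z_b}{Z_a + Z_b}, \tag{54}$$

from which

$$\tanh\theta' = \sqrt{\frac{(1+\lambda)\,r}{1 + (1+\lambda)\,r}}. \tag{55}$$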


This is not the same as equation (48), so that the transformation thus
far has not succeeded in preserving the original transfer constant. We
can regain this transfer constant, however, by making a further change
in the network. It is merely necessary to replace $Z_a$ and $Z_b$ by new
values, $Z_a'$ and $Z_b'$, such that their new ratio $r'$ satisfies the equation

$$r' = \frac{r}{1+\lambda}, \tag{56}$$

since, obviously, the substitution of $r'$ for r in (55) leads us back to the
original transfer constant equation (48).




We shall satisfy equation (56) by replacing each $Z_a$ impedance by
$1/(1+\lambda)$ times its previous value. If, in addition, we replace $\lambda$ by a new
parameter h, where $h = 1/(1+\lambda)$, the structure of Fig. 55 is reduced to the
form shown by Fig. 56. The new network has the same transfer
constant as the network of Fig. 51, of course, but its image impedances, as
found from open- and short-circuit computations, are given by the
expressions

$$Z_{I_1} = \frac{\sqrt{Z_a Z_b}\,\sqrt{1+r}}{1+hr} \tag{57}$$

and

$$Z_{I_2} = \frac{\sqrt{Z_a Z_b}\,(1+hr)}{\sqrt{1+r}}. \tag{58}$$


Fig. 56. Configuration of first order h-derived network


The quantities $\sqrt{Z_aZ_b}\,\sqrt{1+r}$ and $\sqrt{Z_aZ_b}/\sqrt{1+r}$ are, of course, the
prototype image impedances of equations (49) and (50). Equations
(57) and (58) can, therefore, be rewritten in the somewhat simpler forms

$$Z_{I_1} = \frac{Z_{I_1 p}}{1+hr}, \tag{59}$$

$$Z_{I_2} = Z_{I_2 p}\,(1+hr). \tag{60}$$

An inspection of these equations shows that the transformation has
made two changes in the image impedances of the prototype. An
obvious change is the introduction of the new factor $1 + hr$ into both
impedance expressions. In addition, we may observe that the
mid-series impedance $Z_{I_1 p}$ occurs in $Z_{I_1}$, which is calculated from what were




formerly the mid-shunt terminals of the structure, and vice versa. The
transformation has, therefore, reversed the mid-series** and mid-shunt
ends of the network.

This transformation, preserving the transfer constant but changing
the image impedance, is in a sense the converse of the m-derivation,
which preserves the image impedance of the prototype but produces a
different transfer constant. The parallelism between the two is
enhanced by the observation that physically possible values of the
multiplier $\lambda$ correspond to values of h between zero and one. Our parameter
h is thus confined within the same limits as the parameter m in a
physically realizable m-derived ladder network. In each case also the value
unity corresponds to an identical transformation. The process by which
Fig. 56 is obtained from Fig. 51 will be called an h-derivation because
of this analogy.
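The first order h-derivation can be verified numerically with chain (ABCD) matrices. The sketch below is an illustration under stated assumptions: the network of Fig. 56 is taken to be a series branch $\lambda Z_{sc}$ at terminals 1-2, followed by the shunt arm $Z_b$ and the series arm $hZ_a$, and all helper names and sample values are hypothetical. For a chain matrix with entries A, B, C, D the code uses $\tanh^2\theta = BC/AD$, and the squared image impedances $AB/CD$ and $DB/CA$ at the two ends.

```python
def matmul(m, n):
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def series(z):
    """Chain matrix of a series impedance z."""
    return ((1, z), (0, 1))


def shunt(z):
    """Chain matrix of a shunt impedance z."""
    return ((1, 0), (1 / z, 1))


def h_derived(za, zb, h):
    """Assumed layout of Fig. 56; port 1 corresponds to terminals 1-2."""
    lam = 1 / h - 1                      # from h = 1/(1 + lambda)
    za_new = h * za                      # each Z_a replaced by Z_a/(1 + lambda)
    z_sc = za_new * zb / (za_new + zb)   # short-circuit impedance at the mid-shunt end
    return matmul(matmul(series(lam * z_sc), shunt(zb)), series(za_new))


za, zb, h = 2j, -5j, 0.6                 # sample prototype reactances and parameter
r = za / zb
(a, b), (c, d) = h_derived(za, zb, h)

tanh_sq = (b * c) / (a * d)              # should equal r/(1 + r), as in (48)
zi1_sq = (a * b) / (c * d)               # should equal the square of (57)
zi2_sq = (d * b) / (c * a)               # should equal the square of (58)

print(abs(tanh_sq - r / (1 + r)))
print(abs(zi1_sq - za * zb * (1 + r) / (1 + h * r) ** 2))
print(abs(zi2_sq - za * zb * (1 + h * r) ** 2 / (1 + r)))
```

Setting h equal to one makes the added branch vanish and returns the prototype, in agreement with the remark that unity corresponds to an identical transformation.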


Fig. 57


Alternative Configurations for h-Derived Networks

An alternative form of the derivation is based upon the addition of a
multiple of the open-circuit impedance in parallel. The open-circuit
impedance must be determined from the mid-series end of the network
and should be added at that end. If we let $1/\lambda$ be the constant multiplier
of the added impedance, the new network will have the form shown by
Fig. 57. Its transfer constant, as determined from open- and
short-circuit impedance calculations, is then that given by equation (55) so that
if we are to preserve the original transfer constant, we must, as before,

** Since the structure of Fig. 56 terminates in a series branch at both ends we
cannot use the expressions "mid-series" and "mid-shunt" in the physical sense
possible for more familiar ladder type half-sections. We shall hereafter use
"mid-series" to denote the end of the network at which the factor $\sqrt{1+r}$ appears in
the numerator of the image impedance, and vice versa.




Fig. 58. Possible alternative configuration for first order h-derived network

Fig. 59

Fig. 60. Second possible alternative configuration for first order h-derived network

make the further modification in the network expressed by equation
(56). In the present case, equation (56) is satisfied by replacing each
$Z_b$ impedance by $1+\lambda$ times its previous value. If we also replace $\lambda$
by h, where again $h = 1/(1+\lambda)$, the resulting network will be that shown




by  Fig.  58.  Its  image  impedances  are  those  given  by  equations  (57)  and 
(58),  so  that  the  networks  of  Figs.  56  and  58  are  externally  equivalent. 

Fig. 61. H-derivation of a mid-shunt m-derived low-pass half-section

Fig. 62. Alternative configuration for Fig. 61

It is also possible to add a multiple of the short-circuit impedance
at one end and a multiple of the open-circuit impedance at the other end
simultaneously. If we let $\lambda_1$ and $1/\lambda_2$ be the constant multipliers of the
two impedances, the resulting configuration will be that shown by
Fig. 59. Replacing $Z_a$ and $Z_b$ by impedances which are respectively
$1/(1+\lambda_1)$ and $1+\lambda_2$ times as great, we obtain the network of
Fig. 60, where, as before, $h_1$ and $h_2$ represent respectively $1/(1+\lambda_1)$ and
$1/(1+\lambda_2)$. Its transfer constant and image impedance expressions are
the same as those for the networks of Figs. 56 and 58 except that the
product $h_1h_2$ replaces h in both image impedances. The simultaneous
addition of both impedances, therefore, gives us greater flexibility in the
resulting network but leads to no new external characteristics.
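The external equivalence of the three configurations can be checked by comparing their chain (ABCD) matrices. The following Python sketch rests on assumed layouts for Figs. 56, 58 and 60, reconstructed from the verbal description since the figures themselves are not available here, and every function name is hypothetical. The two-sided network is compared with the one-sided ones by choosing $h_1h_2 = h$.

```python
from functools import reduce


def matmul(m, n):
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def chain(*parts):
    """Chain matrix of two-ports connected in tandem, left to right."""
    return reduce(matmul, parts)


def series(z):
    return ((1, z), (0, 1))


def shunt(z):
    return ((1, 0), (1 / z, 1))


def fig56(za, zb, h):
    lam = 1 / h - 1
    za = h * za                          # Z_a scaled by 1/(1 + lambda)
    z_sc = za * zb / (za + zb)           # seen from the mid-shunt end
    return chain(series(lam * z_sc), shunt(zb), series(za))


def fig58(za, zb, h):
    lam = 1 / h - 1
    zb = zb / h                          # Z_b scaled by (1 + lambda)
    z_oc = za + zb                       # seen from the mid-series end
    return chain(shunt(zb), series(za), shunt(z_oc / lam))


def fig60(za, zb, h1, h2):
    lam1, lam2 = 1 / h1 - 1, 1 / h2 - 1
    za, zb = h1 * za, zb / h2
    z_sc, z_oc = za * zb / (za + zb), za + zb
    return chain(series(lam1 * z_sc), shunt(zb), series(za), shunt(z_oc / lam2))


def max_diff(m, n):
    """Largest entry-wise difference between two chain matrices."""
    return max(abs(x - y) for rm, rn in zip(m, n) for x, y in zip(rm, rn))


za, zb = 2j, -5j                         # sample prototype reactances
reference = fig56(za, zb, 0.6)
print(max_diff(reference, fig58(za, zb, 0.6)))
print(max_diff(reference, fig60(za, zb, 0.75, 0.8)))   # h1 * h2 = 0.6
```

Identical chain matrices imply identical transfer constants and image impedances, so the printed differences should be at the level of rounding error.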

As an example of the derivation, we will assume that the prototype
half-section is of the mid-shunt m-derived low-pass type. The
transformation of Fig. 56 leads to the structure of Fig. 61 and that of Fig. 58




to the structure of Fig. 62. It is easily shown that the ratio r for an
m-derived low-pass structure can be written as

$$r = -\,\frac{m^2 f^2/f_c^2}{1 - (1 - m^2)\,f^2/f_c^2}. \tag{61}$$


Since $Z_{I_2 p}$ is of the constant-k type the second image impedance given
by the general formula (60) can be written immediately as

$$Z_{I_2} = Z_{I_2 p}\,\frac{1 - [1 - (1 - h)m^2]\,f^2/f_c^2}{1 - (1 - m^2)\,f^2/f_c^2}, \tag{62}$$


where the factor $1 - (1 - m^2)f^2/f_c^2$, of course, vanishes at the peak of
attenuation. The prototype mid-series impedance, $Z_{I_1 p}$, however,
already contains a peak factor, which will cancel with $1 - (1 - m^2)f^2/f_c^2$
when r is replaced by (61) in equation (59). The first image impedance
consequently becomes

$$Z_{I_1} = Z_{I_2 p}\,\frac{1 - f^2/f_c^2}{1 - [1 - (1 - h)m^2]\,f^2/f_c^2}. \tag{63}$$


These two h-derived impedances are the same as the image
impedances which would be found in a network obtained by the double
m-derivation or substitution process except that the common factor
$1 - [1 - (1 - h)m^2]f^2/f_c^2$ appearing in both expressions vanishes between
the cut-off and the peak of attenuation and not beyond the peak, as it
would in a doubly m-derived structure. It will be noticed that the peak
factor originally occurred in the mid-series image impedance. The
transformation has, therefore, transferred it from the mid-series to the
mid-shunt characteristic. Since the process also reverses the mid-series
and mid-shunt terminations, however, the peak factor is still found at
the same end of the network, physically, as it was before. In order to
introduce the peak factor into the mid-series impedance of the
transformed network it would be necessary to begin with a mid-series
m-derived section, giving rise (if we use the transformation of Fig. 56) to
the structure shown by Fig. 63.
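The location of the new impedance controlling frequency in this low-pass example can be confirmed with a short computation. The Python sketch below is illustrative: it writes $x = f^2/f_c^2$, takes the ratio r of an m-derived low-pass structure as $-m^2x/[1-(1-m^2)x]$ (the form consistent with the factor $1-[1-(1-h)m^2]x$ quoted in the text), and uses hypothetical function names and sample values of m and h.

```python
def r_low_pass(x, m):
    """Ratio r of the m-derived low-pass structure, with x = f^2/f_c^2."""
    return -m ** 2 * x / (1 - (1 - m ** 2) * x)


def new_factor(x, m, h):
    """Factor 1 - [1 - (1 - h) m^2] x introduced by the h-derivation."""
    return 1 - (1 - (1 - h) * m ** 2) * x


m, h = 0.6, 0.5                              # sample parameters, both between 0 and 1
x_cutoff = 1.0                               # f = f_c
x_peak = 1 / (1 - m ** 2)                    # peak of attenuation
x_zero = 1 / (1 - (1 - h) * m ** 2)          # zero of the new factor

# Identity behind (62): 1 + h r = new_factor / peak_factor, checked at a sample x.
x = 0.3
lhs = 1 + h * r_low_pass(x, m)
rhs = new_factor(x, m, h) / (1 - (1 - m ** 2) * x)

print(abs(lhs - rhs))
print(x_cutoff, x_zero, x_peak)
```

For any h between zero and one the zero of the new factor lies between the cut-off and the peak, which is the range where r is negative.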



Fig. 63. H-derivation of a mid-series m-derived low-pass half-section

Fig. 64. H-derivation of a band-pass structure

Similar results are obtained from other types of filter structures,
including those in which impedance controlling factors have already
been added by the substitution method described in an earlier section.
As a second example we will take the band-pass network of Fig. 54 which
we previously used to illustrate the substitution process. Since the
resonant frequencies of the $Z_b$ impedance in that figure are also resonant
frequencies of the $Z_a$ impedance, the parallel combination of $Z_a$ and $Z_b$
impedances which forms the new series arm introduced by the
h-derivation will have the same general physical configuration as the $Z_a$
impedance alone. The h-derived network can consequently be represented
by the structure of Fig. 64. In determining the image impedances of
this structure we may first observe that since the prototype image
impedances $Z_{I_1 p}$ and $Z_{I_2 p}$ form a part of the general formulae (59) and
(60) all of the impedance controlling factors previously added by the
substitution process will appear unchanged after the network is
h-derived. The h-derivation will, however, add a new factor vanishing




between  the  upper  cut-off  and  the  peak  of  attenuation  and  it  will  also 
produce  the  interchanges  in  the  locations  of  the  peak  factor  and  of  the 
mid-shunt  and  mid-series  impedances  which  we  have  already  noticed 
in  our  discussion  of  low-pass  filters.  The  image  impedance  expressions 
we  found  for  the  network  of  Fig.  54  can,  therefore,  be  replaced  by 


[Equations (64) and (65) are illegible in the scanned source.]


where  a*  represents  the  new  factor  introduced  by  the  transformation. 


H-Derived Networks of Higher Orders

The process we have just described allows us to introduce just one
impedance controlling factor in the range where r is negative. By
continuing step-by-step in the proper fashion, however, we can introduce
any other necessary factors into this range. At each stage, we may add
either a multiple of the short-circuit impedance in series at the mid-shunt
terminals of the network or a multiple of the open-circuit impedance in
parallel at the mid-series terminals. The added impedances, of course,
must be determined from the end of the network to which they are to be
connected, and from the network to which they are to be applied and not
from the original prototype. As in the first derivation, either the $Z_a$ or
$Z_b$ impedances must be multiplied or divided by $1+\lambda$ after each such
impedance mesh is added. After n transformations, the image
impedance will contain n rational factors of the form $1 + c_i r$. The c's are
obtained by starting with the product of all of the h's and dividing
successively by the h's of increasing order. They, therefore, appear as
$h_1h_2h_3\cdots h_n;\ h_2h_3\cdots h_n;\ \ldots;\ h_{n-1}h_n;\ h_n$.
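The rule for the coefficients c can be stated as a running product taken from the last h backwards. The short Python sketch below is only an illustration; the function name and the sample values of h are hypothetical.

```python
from itertools import accumulate
from operator import mul


def c_coefficients(hs):
    """Return [c_1, ..., c_n] with c_i = h_i * h_(i+1) * ... * h_n."""
    suffix_products = list(accumulate(reversed(hs), mul))   # h_n, h_(n-1) h_n, ...
    return suffix_products[::-1]


hs = [0.5, 0.8, 0.25]          # sample h_1, h_2, h_3
cs = c_coefficients(hs)
print(cs)                      # c_1 = h1*h2*h3, c_2 = h2*h3, c_3 = h3
```

Dividing each coefficient by the next h in order, as the text describes, gives the same list: $c_{i+1} = c_i/h_i$.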

Since the characteristics of the transformations of the second and
higher orders are exactly similar to those of the first order derivation,
they will not be considered here. We will turn instead to the general
proof that the transformation is always possible. The proof is most
readily obtained by induction. We will suppose for concreteness that
we have been able to find a third order derived network. Its transfer
constant, then, must be that given by equation (48) and it must be made
up of some combination of positive $Z_a$ and $Z_b$ impedances. One image
impedance will be of the form

$$Z_{I_1} = \sqrt{Z_a Z_b}\,\frac{(1+h_3 r)(1+h_1h_2h_3 r)}{\sqrt{1+r}\,(1+h_2h_3 r)}. \tag{66}$$




The second image impedance is immaterial for our present purposes.
It can be shown from the general theorem on the relation between the
transfer constant and the image impedances, however, that it must be
given, except for a possible constant multiplier, by the inverse of the
first image impedance with respect to the product $Z_aZ_b$. If we neglect
this possible multiplier, therefore, it can be written as

$$Z_{I_2} = \sqrt{Z_a Z_b}\,\frac{\sqrt{1+r}\,(1+h_2h_3 r)}{(1+h_3 r)(1+h_1h_2h_3 r)}. \tag{67}$$

We wish to transform this structure into one containing a fourth
impedance controlling factor. At the first end of the network the
open- and short-circuit impedances, as determined from the equations
$Z_{sc} = Z_I\tanh\theta$ and $Z_{oc} = Z_I/\tanh\theta$, are

$$Z_{sc} = Z_a\,\frac{(1+h_3 r)(1+h_1h_2h_3 r)}{(1+r)(1+h_2h_3 r)}, \tag{68}$$

$$Z_{oc} = Z_b\,\frac{(1+h_3 r)(1+h_1h_2h_3 r)}{1+h_2h_3 r}. \tag{69}$$

Since the image impedance at this end is of the mid-shunt type we must
add a short-circuit impedance in series rather than an open-circuit
impedance in shunt. Upon adding, then, $\lambda Z_{sc}$ to (68) and (69) we find
that they become

$$Z_{sc} = (1+\lambda)\,Z_a\,\frac{(1+h_3 r)(1+h_1h_2h_3 r)}{(1+r)(1+h_2h_3 r)}, \tag{70}$$

$$Z_{oc} = \left[Z_b + \frac{\lambda Z_a}{1+r}\right]\frac{(1+h_3 r)(1+h_1h_2h_3 r)}{1+h_2h_3 r}
= Z_b\,\frac{[1+(1+\lambda)r]\,[1+h_3 r]\,[1+h_1h_2h_3 r]}{[1+r]\,[1+h_2h_3 r]}. \tag{71}$$

The transformation is completed by replacing all of the $Z_a$ impedances
by $1/(1+\lambda)$ times their previous values, leaving the $Z_b$ impedances
unchanged. This changes the open- and short-circuit impedance
expressions to




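$$Z_{sc} = Z_a\,\frac{\left[1+\frac{h_3 r}{1+\lambda}\right]\left[1+\frac{h_1h_2h_3 r}{1+\lambda}\right]}{\left[1+\frac{r}{1+\lambda}\right]\left[1+\frac{h_2h_3 r}{1+\lambda}\right]} \tag{72}$$

and

$$Z_{oc} = Z_b\,\frac{(1+r)\left[1+\frac{h_3 r}{1+\lambda}\right]\left[1+\frac{h_1h_2h_3 r}{1+\lambda}\right]}{\left[1+\frac{r}{1+\lambda}\right]\left[1+\frac{h_2h_3 r}{1+\lambda}\right]}, \tag{73}$$

from which the transfer constant of the new network is given by

$$\tanh\theta' = \sqrt{\frac{r}{1+r}} = \tanh\theta, \tag{74}$$

while if we let $h_4 = 1/(1+\lambda)$ its image impedance can be written as

$$Z_{I_1}' = \sqrt{Z_a Z_b}\,\frac{\sqrt{1+r}\,(1+h_3h_4 r)(1+h_1h_2h_3h_4 r)}{(1+h_4 r)(1+h_2h_3h_4 r)}. \tag{75}$$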


These transfer constant and image impedance characteristics are, of
course, those of the fourth order h-derived structure which we seek.
The same general analysis can obviously be used also for the addition
of an open-circuit impedance at the mid-series end or for the
simultaneous addition of open- and short-circuit impedances. Since nothing in
the analysis depends upon the fact that we assumed that the initial
network was of the third order, our conclusions are equally valid for
h-derived structures of any order. It follows at once that if we start
with the first order structure, whose existence has been established
directly, an infinite chain of h-derived networks can be obtained. By
making use of the substitution process, either before or after the
h-derivation, therefore, we can construct single root structures with any
physically possible set of impedance controlling frequencies.
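The induction can also be followed numerically by repeating the derivation on chain (ABCD) matrices. The Python sketch below is an illustration under assumptions: the prototype is taken as a shunt arm $Z_b$ followed by a series arm $Z_a$, each stage adds $\lambda Z_{sc}$ in series at the end of mid-shunt type and scales every $Z_a$ by $h = 1/(1+\lambda)$, and the two ports are interchanged after each stage so that port 1 is again of mid-shunt type. All names and sample values are hypothetical.

```python
def matmul(m, n):
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def series(z):
    return ((1, z), (0, 1))


def shunt(z):
    return ((1, 0), (1 / z, 1))


def flip(m):
    """Interchange the two ports of a reciprocal two-port."""
    (a, b), (c, d) = m
    return ((d, b), (c, a))


def prototype(za, zb):
    """Assumed prototype half-section; port 1 is the mid-shunt end."""
    return matmul(shunt(zb), series(za))


def h_derive(net, h):
    """One h-derivation of net(za, zb), whose port 1 must be of mid-shunt type."""
    lam = 1 / h - 1

    def derived(za, zb):
        inner = net(h * za, zb)              # every Z_a scaled by 1/(1 + lambda)
        (a, b), (c, d) = inner
        z_sc = b / d                         # short-circuit impedance at port 1
        return flip(matmul(series(lam * z_sc), inner))

    return derived


def expected_zi_sq(za, zb, hs):
    """Square of the image impedance at the end modified last, following the pattern of (75)."""
    r, n = za / zb, len(hs)
    value, c = za * zb * (1 + r), 1.0
    for i in range(n, 0, -1):
        c *= hs[i - 1]                       # c_i = h_i h_(i+1) ... h_n
        factor = (1 + c * r) ** 2
        value = value / factor if (n - i) % 2 == 0 else value * factor
    return value


hs = [0.9, 0.7, 0.8, 0.6]                    # sample h_1 ... h_4
za, zb = 2j, -5j
net = prototype
for h in hs:
    net = h_derive(net, h)

(a, b), (c, d) = net(za, zb)
r = za / zb
tanh_sq = (b * c) / (a * d)                  # should still equal r/(1 + r)
zi_sq = (d * b) / (c * a)                    # port 2 is the end modified last
print(abs(tanh_sq - r / (1 + r)), abs(zi_sq - expected_zi_sq(za, zb, hs)))
```

Nothing in the loop depends on the length of the list, so the same check applies to a derivation of any order.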


CONCLUSION 

With the development of h-derived sections, our formal theoretical
analysis  of  filters  comes  to  an  end.  In  effect,  it  will  be  recalled,  the 
analysis  began  with  the  statement  of  the  positive  definiteness  of  the 
network  energy  functions.  These  were  obviously  necessary  conditions 
for  physically  realizable  networks,  but  they  were  not  in  a  form  which 
made  their  significance  in  filter  theory  readily  apparent.  From  them, 
however,  it  was  possible  to  determine  a  number  of  other  conditions  by 




means  of  which  the  image  parameters  of  physically  realizable  filters 
could  be  restricted  to  certain  rather  narrowly  defined  analytic  forms. 
We  were  led  finally  to  the  conclusion  that  the  most  general  physically 
realizable  filter  could  be  regarded  analytically  as  a  combination  of  a  few 
hypothetical  elementary  structures,  which  had  comparatively  simple 
and  definite  characteristics.  This  was  as  far  as  our  direct  investigation 
of  the  consequences  of  the  positive  definiteness  of  the  energy  function 
could  take  us  and  it  gave  us,  of  course,  only  a  list  of  necessary  conditions. 
The  second  half  of  the  paper,  however,  established  the  existence  of  a 
definite  physical  structure  corresponding  to  each  of  the  theoretically 
determined  elementary  constituents  of  the  general  filter  and  thus  showed 
that  the  positive  definiteness  conditions  with  which  we  started  were 
sufficient  as  well  as  necessary. 

In  a  certain  sense,  it  is  fair  to  say  that  the  theory  thus  developed  is 
exhaustive.  There  are  no  characteristics  physically  obtainable  from 
ordinary networks which the analysis does not give, and there is no way
of breaking filters down into elementary constituents which are
themselves filters and which are more primitive than the constituents given
in  the  paper.  In  a  broader  sense,  of  course,  many  problems  remain  to 
be  answered.  We  might  consider,  for  example,  the  characteristics  which 
could  be  obtained  if  we  had  adopted  some  other  definition  of  filters. 
Even  if  we  adhere  to  our  present  definition  several  questions  are  raised 
when  we  try  to  apply  the  method  practically.  One  which  has  already 
been suggested, for example, deals with the arrangement of the
elementary sections in the general composite structure to secure the most
economical  overall  network.  Other  obvious  questions  are  concerned 
with  possible  alternative  physical  embodiments  of  the  elementary 
sections  and  with  the  choice  of  parameters  required  to  obtain  desirable 
external  characteristics  from  filters. 

The  discussion  of  such  problems  as  these  is  beyond  the  scope  of  this 
paper.  The  primary  object  of  the  paper  is  to  construct  the  systematic 
theoretical  organization  of  filters  and  filter  characteristics  which  we  have 
just  completed.  Before  leaving  the  subject,  however,  we  may  consider 
briefly  two  consequences  of  the  analysis  whose  practical  applications 
are more immediate. The first result is the fact that in the h-derived
and complex m-sections our discussion points to two types of structures
having novel characteristics. The h-derived sections can be dismissed
briefly. In their impedance properties they resemble the lattice, while
in their transmission properties they are simply well-known ladder
sections. Their characteristics are thus not individually novel. The




novelty  of  the  sections  arises  simply  from  the  fact  that  they  furnish 
impedance  and  transmission  characteristics  in  combinations  which 
were  not  previously  obtainable.  The  complex  m-sections  on  the  other 
hand,  furnish  novel  phase  and  attenuation  characteristics  and  thus 
considerably  increase  the  arsenal  of  devices  at  the  disposal  of  the 
designing  engineer.** 

A second important practical result of the theory is the increased
ability it gives us to convert filter designs from one configuration to
another which may be more favorable for purposes of physical
construction. A good example is furnished by the lattice. As we have
mentioned several times previously, the lattice has considerable
advantages in generality and theoretical simplicity over ladder networks.
Unfortunately, however, these advantages have in the past been almost
unavailable for practical purposes because of the great difficulty in
constructing most lattice filters with sufficient precision. The difficulty
arises from the fact that, since the lattice is a bridge, the attenuation
which a lattice filter provides depends upon the balance between the
bridge arms. To secure a high attenuation a very close balance is
necessary. For example, if a lattice structure is to provide at least 60 db
attenuation, $\tanh(\theta/2)$, or the square root of the ratio of the $Z_x$ and $Z_y$
branches, must be held between the limits 0.998 and 1.002. The
elements which enter into the structure must therefore be made with
extreme accuracy. On the other hand, if the structure can be broken
down into two parts, each supplying only 30 db, $\tanh(\theta/2)$ for each of them
may vary between the limits 0.94 and 1.06, which corresponds to element
accuracies easily obtained in practice. By following the present theory
such conversions from a complicated lattice to a number of simple
structures in tandem can readily be made. It is thus possible to
preserve the advantages of the lattice for theoretical work and still arrive
at a final configuration which will be satisfactory in practice.
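The tolerance figures quoted above can be reproduced with a few lines of Python. This is a sketch under two assumptions: $\tanh(\theta/2)$ is treated as a real number t, as it is in an attenuating band where the ratio of the lattice branches is positive, and the attenuation is taken as $20\log_{10}|e^{\theta}|$ with $e^{\theta} = (1+t)/(1-t)$. The function name is hypothetical.

```python
import math


def image_attenuation_db(t):
    """Attenuation in db for a real value t of tanh(theta/2).

    Uses e^theta = (1 + t)/(1 - t) and 20*log10|e^theta|.
    """
    return 20 * math.log10(abs((1 + t) / (1 - t)))


for t in (0.998, 1.002, 0.94, 1.06):
    print(f"tanh(theta/2) = {t}: {image_attenuation_db(t):.1f} db")
```

The limits 0.998 and 1.002 correspond to very nearly 60 db, while 0.94 and 1.06 both give slightly more than 30 db, so that two such sections in tandem supply the same total attenuation with a far looser balance.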

**  Some  indication  of  the  characteristics  furnished  by  complex  m-sections  can 
be  secured  from  the  paper  on  “Ideal  Filters." 


A SIX COLOR PROBLEM¹

By Philip Franklin

1.  Introduction.  The  problem  of  coloring  a  particular  map  in  such 
a  way  that  any  two  countries  contiguous  along  an  edge  have  distinct 
colors  leads  to  a  more  general  question.  For  any  surface,  what  is  the 
smallest  number  of  colors  sufficient  to  color  any  map  drawn  on  it? 
For  surfaces  topologically  equivalent  to  the  plane  or  sphere  it  is  known 
that  four  colors  are  sometimes  necessary,  and  that  five  will  always 
suffice. Whether five are ever actually needed is an open question.²

The problem for surfaces other than the sphere was investigated by
Heawood.³ He found a formula for a number of colors sufficient for
any maps. This same number was also shown to be an upper limit to the
number of regions on the surface, each of which touches all the rest.
Evidently a set of neighboring regions of this type, regarded as a map,
requires a new color for each country. Thus, whenever an example
is at hand showing that the upper limit of Heawood is actually reached,
the number of his formula is necessary as well as sufficient, and the
map coloring problem is solved for this type of surface.

As such examples are known for many of the earlier cases, it is often
assumed that the map-coloring problem has been solved for all surfaces
other than the sphere. However, in this note we show that for the
one-sided (non-orientable) surface of characteristic zero, while the
Heawood number is seven, the maximum number of neighboring regions
is in fact six, so that his investigations leave the map coloring problem
unsolved for this surface. We further show that for maps on this
surface, six colors are always sufficient. Thus Heawood's formula is
incorrect for this surface, and may also fail in other cases. The situation
on the particular surface studied throws light on the possibilities in the
case of the plane, i.e. the four-color theorem.

2. The Heawood reasoning. We begin by recalling Heawood's
argument. Consider a closed surface (two-dimensional manifold) of
characteristic K.

¹ Presented to the American Mathematical Society.

² For the status of the problem, and references to the literature see two papers
by C. N. Reynolds in the Annals of Mathematics, vol. 28 (1927), p. 1 and p. 427.

³ P. J. Heawood, Quarterly Journal of Mathematics, vol. 24 (1890), p. 332.

PHILIP FRANKLIN

Then⁴ for any subdivision into simply-connected regions,
the number of regions ($a_2$), edges ($a_1$) and vertices ($a_0$) are related by

(1) $a_0 - a_1 + a_2 = K.$

Without  essentially  changing  our  problem,  we  may  restrict  ourselves  to 
regular  maps,  i.e.  those  in  which  only  three  edges  meet  at  a  point,  and 
in  which  all  the  regions  are  simple.  If  such  a  map  has  An  regions  which 
are  polygons  of  n  sides,  we  have,  since  each  edge  touches  two  regions, 
and  each  vertex  meets  three  ends  of  edges, 

(2)  $a_2 = \sum A_n, \quad 2a_1 = \sum nA_n, \quad 3a_0 = \sum nA_n.$

It follows from (1) and (2) that

$\tfrac{1}{3}\sum nA_n - \tfrac{1}{2}\sum nA_n + \sum A_n = K$

or

(3)  $\sum (6 - n)A_n = 6K.$

If we call the average number of contacts of regions with other
regions $N$, we have

(4)  $N \sum A_n = \sum nA_n$

and

(5)  $(6 - N)\sum A_n = 6K.$
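The relations (1) to (5) can be checked on small regular maps. The following Python sketch is only an illustration: the function names and the three sample maps (four triangles, six pentagons, seven hexagons) are our own choices, made to match the examples discussed later in the paper.

```python
from fractions import Fraction


def regular_map_counts(regions):
    """Return (a0, a1, a2) for a regular map.

    `regions` maps n (number of sides) to A_n (how many n-sided regions).
    Uses equations (2): a2 = sum A_n, 2*a1 = sum n*A_n, 3*a0 = sum n*A_n.
    """
    a2 = sum(regions.values())
    side_total = sum(n * count for n, count in regions.items())
    if side_total % 6 != 0:
        raise ValueError("sum n*A_n must be divisible by 6 in a regular map")
    return side_total // 3, side_total // 2, a2


def characteristic(regions):
    """Characteristic K from equation (3): sum (6 - n) A_n = 6K."""
    return Fraction(sum((6 - n) * count for n, count in regions.items()), 6)


def average_contacts(regions):
    """Average number of contacts N from equation (4)."""
    _, a1, a2 = regular_map_counts(regions)
    return Fraction(2 * a1, a2)


# Sample maps (assumed for illustration): sphere, projective plane, torus.
for sample in ({3: 4}, {5: 6}, {6: 7}):
    a0, a1, a2 = regular_map_counts(sample)
    K = characteristic(sample)
    assert a0 - a1 + a2 == K                          # equation (1)
    assert (6 - average_contacts(sample)) * a2 == 6 * K  # equation (5)
```

Exact rational arithmetic is used so that equation (5) is verified without rounding.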

We  next  observe  that  if  a  map  contains  a  single  region  of  p  —  2  or 
less  sides,  it  is  reducible  with  respect  to  p  —  1  colors.  For,  if  we 
shrink  the  single  region  to  a  point  and  color  the  resulting  map  in  p  —  1 
colors,  the  p  —  2  abutting  regions  will  not  exhaust  all  the  colors,  so  that 
one  will  be  left  for  the  single  region  when  it  is  replaced.  Consequently, 
unless  all  maps  are  reducible  with  respect  to  p  —  1  colors,  in  which 
case  p  —  1  colors  would  suffice,  there  will  be  some  map  all  of  whose 
regions  contain  at  least  p  —  1  sides.  Let  us  now  assume  that  p  colors 
are  required,  and  consider  the  map  just  shown  to  exist.  Let  it  have  q 
regions, each of which touches $p - 1 + k_i$ others ($k_i \geq 0$). Then, analogous to
(2) above, we have

(6)  $a_2 = q, \quad 2a_1 = \sum (p - 1 + k_i), \quad 3a_0 = \sum (p - 1 + k_i)$

and from (1) and (6)

$(\tfrac{1}{3} - \tfrac{1}{2})\sum (p - 1 + k_i) + q = K$

or, as the summation runs over $q$ terms,

$6q - q(p - 1) - \sum k_i = 6K$

and hence

(7)  $6K + q(p - 7) = -\sum k_i \leq 0.$

* See e.g. S. Lefschetz, Topology, New York, 1930, p. 44, or O. Veblen, Analysis
Situs, New York, 1931, p. 54.

If $p \geq 7$, an assumption we make provisionally, $p - 7 \geq 0$, and since
$q \geq p$, we deduce the relation

$6K + p(p - 7) \leq 0$

or

(8)  $p^2 - 7p + 6K \leq 0.$

This requires that $p$ be situated between the roots given by

$2p = 7 \pm \sqrt{49 - 24K},$

so that

(9)  $p \leq \tfrac{1}{2}(7 + \sqrt{49 - 24K}).$

This limit is $\geq 6$ for $K \leq 1$. But $K$ is two for the sphere, one for the
projective plane and is zero or negative for all other surfaces. Consequently,
except for the sphere, the formula gives an upper limit. For, either
$p \geq 7$ in which case our argument holds, or $p < 7$ and integral so that
any number $\geq 6$ is certainly an upper limit.

Consider next a map of neighboring regions, $p$ in number, each of
which touches the remaining ones (possibly along more than one edge).
For such a map we have to set $q = p$ in (6) and (7), which then reduce
to (8) directly. Thus the inequality (9), regarded as an upper limit
to the number of neighboring regions on a surface, holds for all
values of $K$.

3. Conclusions. Heawood had in mind two-sided surfaces (orientable
two-dimensional manifolds), but his reasoning applies equally well to
the one-sided case, as many later writers* have observed. Thus, if we
set

$P_K = [\tfrac{1}{2}(7 + \sqrt{49 - 24K})],$

where as usual $[x]$ means the greatest integer in $x$, we find:

$K$   =  2,  1,  0,  -1,  -2,  -3,  -4, ...
$P_K$ =  4,  6,  7,   7,   8,   9,   9, ...

If the maximum possible number of neighboring regions is actually $P_K$,
we have a map requiring $P_K$ colors. But, except for $K = 2$, we have
seen that $P_K$ colors suffice. Hence, if the examples can be found, the
map coloring problem is solved by Heawood’s formula. Such examples
are known for all the two-sided cases* up to $K = -6$, and for the projective
plane,* $K = 1$.

* H. Tietze, Deutsch. Math. Verein. Jahresber., vol. 19 (1910), p. 155.

In  passing  we  recall  that  when  K  =  2,  equations  (3)  or  (5)  show  that 
some  regions  contain  five  or  less  sides,  and  a  well-known  reduction 
enables  us  to  prove  that  five  colors  are  sufficient  for  this  case. 
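The tabulated values of the Heawood number can be reproduced with exact integer arithmetic. The sketch below is illustrative only; the function name `heawood_bound` is an assumption of ours, and the integer square root is used so that no floating-point rounding enters the greatest-integer bracket.

```python
from math import isqrt


def heawood_bound(K):
    """Greatest integer in (7 + sqrt(49 - 24K)) / 2 for characteristic K <= 2."""
    discriminant = 49 - 24 * K
    if discriminant < 0:
        raise ValueError("the characteristic of a closed surface is at most 2")
    # floor((7 + x) / 2) depends only on floor(x), so isqrt is sufficient.
    return (7 + isqrt(discriminant)) // 2


table = {K: heawood_bound(K) for K in (2, 1, 0, -1, -2, -3, -4)}
```

For $K = 0$ the function returns seven, the value which the paper shows is attained on the torus but not on the one-sided surface.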


4. Some Examples. To lead up to other cases, let us consider the
simplest examples of neighboring regions. For $K = 2$, $P_2 = p = 4$.
The equality shows that the $k_i$ of (7) are all zeros, so that each of the
four regions has exactly three sides. The map is the tetrahedron shown
in Fig. 1.

Again, for $K = 1$, $P_1 = p = 6$, so that all of the $k_i = 0$, and each of
the six regions has exactly five sides. To see if such a map exists, we
draw one such region (6) adjacent to five others (1, 2, 3, 4, 5), as in
Fig. 2. If now the free edges are paired as indicated, we have a manifold
with $K = 1$. Moreover, they must join in this way to make each
region touch all the rest, so that the example is unique.

Next, let $K = 0$. Then $P_0 = p = 7$, another case of equality with
all of the $k_i = 0$. Thus the map of neighboring regions is here one of
seven hexagons. If the center one is 7, and those adjacent 1, 2, 3, 4,
5, 6, as shown, the outer regions A, B, etc., which are these repeated,
are restricted as indicated. Thus A must be 3, 4 or 5, since 1 is already
in contact with 6, 7 and 2. Again, B must be 4 or 5, since it touches
1 and 2, one of which already has contact with 6, 7 and 3. Similarly
for the rest. But, if B is 5, D must be 6, to give a new contact for C,
and C must be 4. This necessitates that F be 1, and E be 5. Continuing
in a clockwise direction in this way, we find the contacts all
determined, and we have the map of hexagons shown in Fig. 4. If we
took B = 4, and went counter-clockwise, we should find a similar map.
Thus this example is essentially uniquely determined. As it is on a
torus, or two-sided surface with $K = 0$, we see that for one-sided
surfaces with $K = 0$, the maximum number of neighboring regions as
given by the Heawood formula is not realized.

* L. Heffter, Math. Annalen, vol. 38 (1891), p. 477.
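A map of seven mutually neighboring hexagons on the torus can be exhibited concretely. The following Python sketch is an illustration under our own assumptions: it uses axial coordinates $(q, r)$ on the hexagonal lattice and the labeling $(q + 3r) \bmod 7$, with labels 0 to 6 rather than the numbers 1 to 7 of the figures. The labeling repeats under a lattice of translations, so the pattern closes up on a torus with seven regions.

```python
# Axial coordinates (q, r) on a hexagonal lattice: every cell has six neighbors.
NEIGHBOR_STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def label(q, r):
    """Hypothetical labeling: region number 0..6 of the hexagon at (q, r)."""
    return (q + 3 * r) % 7


def neighbor_labels(q, r):
    """Sorted labels of the six hexagons adjacent to the hexagon at (q, r)."""
    return sorted(label(q + dq, r + dr) for dq, dr in NEIGHBOR_STEPS)


def every_region_touches_all_others(extent=7):
    """Check that each hexagon is surrounded by exactly the six other labels."""
    for q in range(-extent, extent + 1):
        for r in range(-extent, extent + 1):
            others = [k for k in range(7) if k != label(q, r)]
            if neighbor_labels(q, r) != others:
                return False
    return True
```

The six steps change the label by 1, 6, 3, 4, 5 and 2 modulo 7, which are all different and none zero, so every region meets all six others along one edge each.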

5. Reducibility in Six Colors. Since a one-sided surface with $K = 0$
(or Klein sack) may be built up of two projective planes (surfaces with
$K = 1$), the map of Fig. 2 leads to a related one on a Klein sack requiring
six colors. Thus since an example requiring seven colors of the simplest
kind is not at hand, it may be that six colors are sufficient. Let us
consider, then, reductions of maps on surfaces with $K = 0$ when six
colors are used.

If the map has any region of five sides or less, it is reducible by the
argument given after equation (5). But, by equation (5), when $K = 0$,
the average number of contacts $N = 6$, so that if no regions of less
than six sides are at hand, none of more than six sides can be present.

Consider, then, a map all of whose regions are hexagons. Let one
of them be 7, as indicated in Fig. 5, surrounded by regions 1, 2, 3, 4, 5, 6.
The map is obviously reducible if these are not all distinct. Again,
suppose that 1 is not in contact with 4. By erasing boundaries as
shown in Fig. 6, and coloring the resulting map, region seven touches
others of at most five colors, so that a sixth is left for it and the map
is reducible. A similar argument shows that, if our map is irreducible,
1 must touch 3 and 5, so that it is in contact with 3, 4, and 5. As it is
a hexagon, it has no other contacts beyond those indicated with 2, 7
and 6. Similarly, for the other regions, so that the map consists of
exactly seven hexagons each of which touches all the rest. That is,
it is the map of Fig. 3, or, essentially, that of Fig. 4.

This  proves 

Theorem I. Every map on a manifold with $K = 0$, except that of
seven mutually contiguous hexagons, is reducible with respect to six colors.

Theorem  II.  Every  map  on  a  Klein  sack  is  colorable  in  six  colors. 

6. Relation to the four color problem. The map coloring problem on
other surfaces has no logical connection with that for a sphere, but certain
of the facts brought out for the case $K = 0$ illustrate possibilities
for the unsolved case.

In  the  first  place,  it  previously  seemed  that  the  formula  of  Heawood 
gave  the  correct  number  for  all  cases  except,  perhaps,  that  of  the  sphere. 
It  now  appears  that  the  formula  certainly  fails  sometimes.  Hence,  if 
the  four  color  theorem  is  false,  it  is  not  a  unique  exceptional  case. 

Again,  the  situation  on  the  anchor-ring  is  illuminating.  Here  seven 
colors  are  required,  but  there  is  only  one  map  not  reducible  with  respect 
to  six.  Of  course,  an  infinite  number  are  not  colorable  in  six  colors. 
If,  on  the  sphere,  we  had  a  similar  situation,  all  maps  would  be  reducible 
in  four  colors,  except  a  single  one.  If  this  had  a  large  number  of 
regions,  say  over  one  thousand,  the  theorem,  while  impossible  to  prove, 
would  be  exceedingly  difficult  to  disprove. 




THREE  MATHEMATICAL  METHODS  OF  ANALYZING 
POLARIZED  LIGHT 

By Dorothy W. Weeks*

I.  INTRODUCTION 

The problem of a single beam of polarized light has been mathematically
treated by Poincaré,* Tuckerman* and Wiener.* Poincaré stereographically
projected his complex relations on a sphere, Tuckerman
confined his treatment to trigonometric relations, and Wiener used his
coherency matrices. The purpose of this paper is to correlate these
methods and to extend the methods of Tuckerman and Wiener to the
case of two beams of polarized light.

The correlation is shown by determining the invariants involved.
This determination is brought under the following theorem: If the
axes of

(1)  $A_1 \xi + B_1 \eta + C_1 = 0$

are rotated through an angle $\theta$, then $A_1\bar{A}_1 + B_1\bar{B}_1$ and $A_1\bar{B}_1 - \bar{A}_1 B_1$ are
invariant under this rotation. The parametric wave equations

(2)  $\xi_1 = A_1 e^{i(pt + \phi_1)}, \quad \eta_1 = B_1 e^{i(pt + \psi_1)}$

can be brought into the form of equations (1) as follows:

-  lu  A  -0 

(3) 

|i  Bi  —  0. 

From this it is readily seen that $A_1^2 + B_1^2$ and $A_1 B_1 \sin(\psi_1 - \phi_1)$ are
invariant.
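The invariance stated in the theorem can be tested numerically. The sketch below is only an illustration under our own assumptions: the coefficients are taken as complex numbers $A = A_1 e^{i\phi_1}$, $B = B_1 e^{i\psi_1}$, the rotation acts on the pair $(A, B)$ as a real plane rotation, and the function names are hypothetical.

```python
import cmath
import math


def rotate(A, B, theta):
    """Real rotation of the coefficient pair (A, B) through the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return A * c + B * s, -A * s + B * c


def invariants(A, B):
    """Return (A A* + B B*, A B* - A* B) for complex coefficients A and B."""
    energy = (A * A.conjugate() + B * B.conjugate()).real
    cross = A * B.conjugate() - A.conjugate() * B
    return energy, cross


# Assumed sample amplitudes and phases.
A1, B1, phi1, psi1 = 2.0, 1.5, 0.3, 1.1
A = A1 * cmath.exp(1j * phi1)
B = B1 * cmath.exp(1j * psi1)

before = invariants(A, B)
after = invariants(*rotate(A, B, 0.7))
```

With these definitions the second quantity equals $-2i A_1 B_1 \sin(\psi_1 - \phi_1)$, so its invariance is the invariance of $A_1 B_1 \sin(\psi_1 - \phi_1)$.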


* Wilson College, Chambersburg, Pennsylvania.

* Poincaré, La Lumière, Chap. II and XII.

* L. B. Tuckerman, Jr., The Transmission of Light Through Doubly Refracting
Plates with Applications to Elliptic Analyzing Systems. University of Nebraska
Studies, Vol. IX, No. 2, April, 1900.

* N. Wiener, Coherency Matrices and Quantum Theory. M. I. T. Journal of
Mathematics and Physics, Vol. VII, No. 2, June, 1928.



II. TUCKERMAN’S METHOD

Tuckerman considered a plane wave of monochromatic elliptically
polarized light falling normally on a series of plane parallel doubly
refracting plates. The axes of reference chosen in each plate are the
planes of polarization of the ordinary and extraordinary vibrations.
The parametric wave equations (2) represent the beam of light referred
to the first plate, where $A_1$ and $B_1$ represent the amplitudes of the
ordinary and extraordinary vibrations respectively, and $(\psi_1 - \phi_1)$ the
phase lag of the extraordinary over the ordinary. If we submit $\xi_1$ and
$\eta_1$ to a rotation through the angle $\omega$, where $\omega$ represents the angle between
the reference axes of the first and second plates, we obtain the displacements
referred to the axes of the second plate. The passage of this
light through a series of such plates rotates the components of the beam
through an angle $\sum \omega$ for $n$ plates. Under such a rotation,
$A_1^2 + B_1^2$ and $A_1 B_1 \sin(\psi_1 - \phi_1)$ are invariant. That is, the sum of the
energies of the two wave components as well as the mutual energy
between these two components are invariants under a rotation. Following
the notation of Tuckerman, let

(4)  $A^2 + B^2 = 2P, \quad AB \cos(\psi - \phi) = K,$
     $A^2 - B^2 = 2Q, \quad AB \sin(\psi - \phi) = S.$

Therefore $P$ and $S$ are the invariants under a rotation.
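The behaviour of Tuckerman's four quantities under a rotation of the reference axes can be illustrated numerically. The Python sketch below rests on assumptions of ours: the two components are represented by complex amplitudes $a = A e^{i\phi}$ and $b = B e^{i\psi}$, the rotation is a real rotation of the pair, and the function names are hypothetical.

```python
import cmath
import math


def tuckerman(a, b):
    """Return (P, Q, K, S) from complex amplitudes a = A e^{i phi}, b = B e^{i psi}."""
    P = (abs(a) ** 2 + abs(b) ** 2) / 2
    Q = (abs(a) ** 2 - abs(b) ** 2) / 2
    mutual = b * a.conjugate()  # equals A B exp(i (psi - phi))
    return P, Q, mutual.real, mutual.imag


def rotate(a, b, omega):
    """Components referred to axes turned through the angle omega."""
    c, s = math.cos(omega), math.sin(omega)
    return a * c + b * s, -a * s + b * c


# Assumed sample beam.
a = 1.3 * cmath.exp(0.2j)
b = 0.8 * cmath.exp(1.0j)
P0, Q0, K0, S0 = tuckerman(a, b)
P1, Q1, K1, S1 = tuckerman(*rotate(a, b, 0.9))
```

In this sketch $P$ and $S$ come out unchanged while $Q$ and $K$ change, and the combination $Q^2 + K^2 + S^2 - P^2$ vanishes both before and after the rotation.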

III. METHOD OF POINCARÉ

By the elimination of the time factor from the parametric wave
equations, the resulting equation is that of an ellipse. Poincaré starts
from this point of view.

Let

(5)  $\frac{\eta}{\xi} = \frac{B}{A} e^{i(\psi - \phi)} = W = u + iv$

and

$\bar{W} = u - iv,$

(6)  $u^2 + v^2 = W\bar{W} = \frac{B^2}{A^2},$

(7)  $u = \frac{W + \bar{W}}{2} = \frac{B}{A} \cos(\psi - \phi),$

(8)  $v = \frac{W - \bar{W}}{2i} = \frac{B}{A} \sin(\psi - \phi).$


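If the axes of the ellipse make an angle $\theta$ with the coordinate axes, $\xi$ and
$\eta$ are turned through an angle $\theta$. From the foregoing considerations
it is shown that if $\xi'$ and $\eta'$ represent the projections of the vibrations on
the axes of the ellipse

(9)  $\frac{\eta'}{\xi'} = \frac{-\xi \sin\theta + \eta \cos\theta}{\xi \cos\theta + \eta \sin\theta} = \frac{u + iv - \tan\theta}{1 + (u + iv)\tan\theta} = i\tau, \quad \text{where } i\tau = u_1 + iv_1.$

If $\theta$ is kept constant while $\tau$ varies from $-\infty$ to $+\infty$ the point $(u, v)$
describes a circle.

(10)  $u_1^2 + v_1^2 = \frac{(u - \tan\theta)^2 + v^2}{(1 + u\tan\theta)^2 + v^2\tan^2\theta}$

(11)  $u_1 = \frac{(u^2 + v^2)\tan\theta + u(1 - \tan^2\theta) - \tan\theta}{(1 + u\tan\theta)^2 + v^2\tan^2\theta}$

(12)  $v_1 = \frac{v(1 + \tan^2\theta)}{(1 + u\tan\theta)^2 + v^2\tan^2\theta}$

If $u_1 = 0$ then

(13)  $(u^2 + v^2)\tan\theta + u(1 - \tan^2\theta) - \tan\theta = 0,$

which is the equation for a circle.

Similarly, if $\tau$ is held constant and $\theta$ allowed to vary

(14)  $\tan\theta = \frac{u + i(v - \tau)}{1 - \tau v + i\tau u}$

(15)  $\tan\theta = \frac{[u + i(v - \tau)][(1 - \tau v) - i\tau u]}{(1 - \tau v)^2 + \tau^2 u^2}$

(16)  $\tan\theta = \frac{u(1 - \tau v) + \tau u(v - \tau)}{(1 - \tau v)^2 + \tau^2 u^2}$

(17)  $\frac{(v - \tau)(1 - \tau v) - \tau u^2}{(1 - \tau v)^2 + \tau^2 u^2} = 0.$

Therefore

(18)  $u^2 + v^2 - (1/\tau + \tau)v = -1,$

which is also an equation for a circle.

The two circles (13) and (18) intersect orthogonally.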
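The two families of circles and their orthogonality can be checked numerically. The following Python sketch is an illustration under our own assumptions: the circle of constant $\theta$ is taken in the form $(u^2 + v^2)\tan\theta + u(1 - \tan^2\theta) - \tan\theta = 0$, the circle of constant $\tau$ in the form $u^2 + v^2 - (1/\tau + \tau)v = -1$, and the function names are hypothetical.

```python
import math


def common_point(theta, tau):
    """Point (u, v) for which (u + iv - tan theta) / (1 + (u + iv) tan theta) = i tau."""
    t = math.tan(theta)
    w = (1j * tau + t) / (1 - 1j * tau * t)
    return w.real, w.imag


def theta_circle_residual(u, v, theta):
    """Left side of the circle equation for constant theta."""
    t = math.tan(theta)
    return (u * u + v * v) * t + u * (1 - t * t) - t


def tau_circle_residual(u, v, tau):
    """Left side minus right side of the circle equation for constant tau."""
    return u * u + v * v - (1 / tau + tau) * v + 1


def orthogonality_defect(theta, tau):
    """d^2 - (r1^2 + r2^2) for the two circles; zero means orthogonal intersection."""
    t = math.tan(theta)
    center1_u = -(1 - t * t) / (2 * t)   # center (center1_u, 0)
    radius1_sq = center1_u ** 2 + 1
    center2_v = (tau + 1 / tau) / 2      # center (0, center2_v)
    radius2_sq = center2_v ** 2 - 1
    distance_sq = center1_u ** 2 + center2_v ** 2
    return distance_sq - (radius1_sq + radius2_sq)


u, v = common_point(0.3, 0.7)
```

Two circles cut at right angles exactly when the squared distance between their centers equals the sum of the squared radii, which is the condition evaluated by `orthogonality_defect`.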



The ellipticity is given by the value of $v$ and the angle $\theta$ by $u$. The
representation on a plane is then stereographically projected on a unit
sphere, the origin $O$ being the point of contact of the sphere with the
plane $u, v$. The $u$ axis is projected into a great circle called the equator,
and the $v$ axis into a great circle orthogonal to the first called the first
meridian. Thus the two effects of double refraction and power of
rotation when superposed may be represented by the rotation of the
Poincaré sphere about some axis. It is possible to give physical interpretations
to groups of rotations on this sphere.

Tuckerman has shown that the sphere defined by $Q^2 + K^2 + S^2 = P^2$
is the Poincaré sphere.

IV.  TWO  POLARIZED  BEAMS  OF  LIGHT — TUCKERMAN’S  METHOD 

Two monochromatic elliptically polarized rays of light falling normally
on a series of plane parallel doubly refracting plates may now be considered.

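Let

(19)  $\xi_j = A_j e^{i(pt + \phi_j)}, \quad \eta_j = B_j e^{i(pt + \psi_j)}, \quad j = 1, 2$

(20)  $\xi = \xi_1 + \xi_2 = A e^{i(pt + \phi)}$

(21)  $\eta = \eta_1 + \eta_2 = B e^{i(pt + \psi)}.$

From consideration of the real parts only it is found that

(22)  $A^2 = A_{11} + 2A_{12} + A_{22}$

(23)  $B^2 = B_{11} + 2B_{12} + B_{22}$

(24)  $AB \cos(\psi - \phi) = K_{11} + K_{12} + K_{21} + K_{22} = K$
      $AB \sin(\psi - \phi) = S_{11} + S_{12} + S_{21} + S_{22} = S$

where

(25)  $K_{11} = A_1 B_1 \cos(\psi_1 - \phi_1)$
      $K_{12} = A_1 B_2 \cos(\psi_2 - \phi_1)$
      $K_{21} = A_2 B_1 \cos(\psi_1 - \phi_2)$
      $K_{22} = A_2 B_2 \cos(\psi_2 - \phi_2)$

(26)  $S_{11} = A_1 B_1 \sin(\psi_1 - \phi_1)$
      $S_{12} = A_1 B_2 \sin(\psi_2 - \phi_1)$
      $S_{21} = A_2 B_1 \sin(\psi_1 - \phi_2)$
      $S_{22} = A_2 B_2 \sin(\psi_2 - \phi_2)$

(27)  $A_{11} = A_1^2$
      $A_{12} = A_1 A_2 \cos(\phi_2 - \phi_1)$
      $A_{21} = A_{12}$
      $A_{22} = A_2^2$

(28)  $B_{11} = B_1^2$
      $B_{12} = B_1 B_2 \cos(\psi_2 - \psi_1)$
      $B_{21} = B_{12}$
      $B_{22} = B_2^2$

(29)  $Q_{12} = A_1 A_2 \sin(\phi_2 - \phi_1)$
      $Q'_{12} = B_1 B_2 \sin(\psi_2 - \psi_1).$

For each ray of monochromatic light let the wave equations be

(30)  $\Psi_j = \xi_j + i\eta_j, \quad \bar{\Psi}_j = \bar{\xi}_j - i\bar{\eta}_j.$

Then

(31)  $\Psi_j \bar{\Psi}_j = A_{jj} + B_{jj} + 2A_j B_j \sin(\psi_j - \phi_j) = 2(P_{jj} + S_{jj}).$

In the case of two rays of light

(32)  $\Psi = \Psi_1 + \Psi_2, \quad \bar{\Psi} = \bar{\Psi}_1 + \bar{\Psi}_2$
      $\Psi\bar{\Psi} = \Psi_1\bar{\Psi}_1 + \Psi_1\bar{\Psi}_2 + \bar{\Psi}_1\Psi_2 + \Psi_2\bar{\Psi}_2.$

Since

(33)  $\Psi_1\bar{\Psi}_2 + \bar{\Psi}_1\Psi_2 = 2(A_{12} + B_{12} + S_{12} + S_{21})$

(34)  $\Psi\bar{\Psi} = A_{11} + B_{11} + 2(S_{11} + S_{12} + S_{21} + S_{22}) + 2(A_{12} + B_{12}) + A_{22} + B_{22}$

(35)  $\Psi\bar{\Psi} = A^2 + B^2 + 2S = 2(P + S).$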




Thus it is seen that $\Psi\bar{\Psi}$ is invariant. This function is a light energy
function, and may be said to correspond to the function of electric
density or probability in quantum mechanics.

It  is  interesting  to  note  that  the  optical  density  is  equal  to  the  sum  of 
the  energies  of  the  two  components  increased  by  twice  the  mutual 
energies,  and  is  therefore  analogous  to  the  energy  relations  in  a  directly 
coupled  electrical  circuit. 

In this case there are seven invariants:

(36)  $A_{11} + B_{11}, \quad A_{12} + B_{12}, \quad S_{11}, \quad S_{12}, \quad A_{22} + B_{22}, \quad S_{22}, \quad S_{21}.$

In the problem of two rays of light, a new element is introduced: the
conditions for interference.

If $\phi_2 - \phi_1 = 0$ the rays are in phase,
then $A^2 = (A_1 + A_2)^2$ and the rays are in resonance.

If $\phi_2 - \phi_1 = \pm\pi$ the rays are in interference and $A^2 = (A_1 - A_2)^2$.

If $A_1 = A_2$, $A^2 = 0$ and complete interference results.
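The decomposition of $A^2$, $B^2$, $K$ and $S$ into the two-index terms, and the interference conditions, can be checked numerically. The Python sketch below is illustrative only. It assumes the conventions $A_{jk} = A_j A_k \cos(\phi_k - \phi_j)$, $B_{jk} = B_j B_k \cos(\psi_k - \psi_j)$, $K_{jk} = A_j B_k \cos(\psi_k - \phi_j)$ and $S_{jk} = A_j B_k \sin(\psi_k - \phi_j)$; the function names and sample values are our own.

```python
import cmath
import math


def pair_terms(amp_left, amp_right, phase_left, phase_right, fn):
    """2 x 2 table of amp_left[j] * amp_right[k] * fn(phase_right[k] - phase_left[j])."""
    return [[amp_left[j] * amp_right[k] * fn(phase_right[k] - phase_left[j])
             for k in range(2)] for j in range(2)]


def total(table):
    """Sum of all four entries of a 2 x 2 table."""
    return sum(sum(row) for row in table)


def resultant(A, B, phi, psi):
    """Return (A^2, B^2, K, S) of the superposed beam from the complex amplitudes."""
    xi = sum(A[j] * cmath.exp(1j * phi[j]) for j in range(2))
    eta = sum(B[j] * cmath.exp(1j * psi[j]) for j in range(2))
    mutual = eta * xi.conjugate()  # equals A B exp(i (psi - phi))
    return abs(xi) ** 2, abs(eta) ** 2, mutual.real, mutual.imag


# Assumed sample amplitudes and phases for the two rays.
A, B = (1.2, 0.7), (0.9, 1.4)
phi, psi = (0.3, 1.1), (0.8, 2.0)
A_jk = pair_terms(A, A, phi, phi, math.cos)
B_jk = pair_terms(B, B, psi, psi, math.cos)
K_jk = pair_terms(A, B, phi, psi, math.cos)
S_jk = pair_terms(A, B, phi, psi, math.sin)
A2, B2, K, S = resultant(A, B, phi, psi)

# Interference cases for the xi component alone (eta switched off).
in_phase = resultant((1.0, 2.0), (0.0, 0.0), (0.4, 0.4), (0.0, 0.0))[0]
opposed = resultant((1.0, 1.0), (0.0, 0.0), (0.0, math.pi), (0.0, 0.0))[0]
```

Because the table of $A_{jk}$ is symmetric, its total is $A_{11} + 2A_{12} + A_{22}$, which is the form used in the text.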


V. WIENER’S METHOD

Wiener, in making a harmonic analysis of his wave functions, forms a
function

Rikiv)  -  «,*(u)  -  2t  *f:’  Ai,Au,  (37) 

where 

Mt)  -  i:  .  (38) 

,  ■  ‘ 

The  matrix 

Rfh(—Ak  -H  0)  “  /?,•*(— A*  —  0)  »  2t A/kAuk 

is  defined  as  a  coherency  matrix. 

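Therefore, if the $f_j(t)$ functions are the parametric wave equations for
polarized light, the coherency matrix is

(39)  $\begin{pmatrix} A\bar{A} & \bar{A}B e^{i(\psi - \phi)} \\ A\bar{B} e^{-i(\psi - \phi)} & B\bar{B} \end{pmatrix}$

where $A\bar{A} + B\bar{B}$ and the determinant $\Delta$ are invariant.

$A\bar{A} + B\bar{B}$, the total energy of the two components, equals Tuckerman’s
$2P$.

Taking the real part of $e^{i(\psi - \phi)}$ the matrix is

(40)  $\begin{pmatrix} A\bar{A} & AB \cos(\psi - \phi) \\ AB \cos(\psi - \phi) & B\bar{B} \end{pmatrix}$

where

(41)  $\Delta = A\bar{A}\, B\bar{B} \sin^2(\psi - \phi) = S^2,$

that is, $\Delta$ equals the mutual energy squared.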
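The statements about the single-beam coherency matrix can be verified on a numerical example. The Python sketch below is an illustration under assumptions of ours: the matrix is formed as the products $\bar{f}_j f_k$ of the complex amplitudes $f_1 = A e^{i\phi}$, $f_2 = B e^{i\psi}$, and the helper names are hypothetical.

```python
import cmath
import math


def coherency_matrix(f1, f2):
    """2 x 2 matrix with entries conj(f_j) * f_k for the two complex amplitudes."""
    f = (f1, f2)
    return [[f[j].conjugate() * f[k] for k in range(2)] for j in range(2)]


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def real_part(m):
    """Entrywise real part of a matrix."""
    return [[entry.real for entry in row] for row in m]


# Assumed sample beam.
A, B, phi, psi = 1.3, 0.8, 0.2, 1.0
matrix = coherency_matrix(A * cmath.exp(1j * phi), B * cmath.exp(1j * psi))
trace = (matrix[0][0] + matrix[1][1]).real
S = A * B * math.sin(psi - phi)
```

In this sketch the complex matrix has determinant zero, its trace is $A^2 + B^2 = 2P$, and the determinant of its real part is $S^2$.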

To  extend  this  method  to  the  case  of  two  simultaneous  polarized  rays 
of  coherent  light, 
let 


(42) 


then 


•  Mt)' 

Vi 

•  m 

h 

-/.(O  ’ 

m 

<A*» 

where 

A, 


(43) 


and 


4>,*(t) 


111 

if  A-/. 


(44) 


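The general coherency matrix for this case is then

(45)  $\begin{pmatrix}
A_1\bar{A}_1 & \bar{A}_1 B_1 e^{i(\psi_1 - \phi_1)} & \bar{A}_1 A_2 e^{i(\phi_2 - \phi_1)} & \bar{A}_1 B_2 e^{i(\psi_2 - \phi_1)} \\
A_1\bar{B}_1 e^{-i(\psi_1 - \phi_1)} & B_1\bar{B}_1 & A_2\bar{B}_1 e^{i(\phi_2 - \psi_1)} & \bar{B}_1 B_2 e^{i(\psi_2 - \psi_1)} \\
A_1\bar{A}_2 e^{-i(\phi_2 - \phi_1)} & B_1\bar{A}_2 e^{-i(\phi_2 - \psi_1)} & A_2\bar{A}_2 & \bar{A}_2 B_2 e^{i(\psi_2 - \phi_2)} \\
A_1\bar{B}_2 e^{-i(\psi_2 - \phi_1)} & B_1\bar{B}_2 e^{-i(\psi_2 - \psi_1)} & A_2\bar{B}_2 e^{-i(\psi_2 - \phi_2)} & B_2\bar{B}_2
\end{pmatrix}$

$A_1\bar{A}_1 + B_1\bar{B}_1 + A_2\bar{A}_2 + B_2\bar{B}_2$ is invariant as well as the determinant of
the matrix, $\Delta = 0$.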



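The real part of the matrix then takes the following form.

(46)  $\begin{pmatrix}
A_{11} & K_{11} & A_{12} & K_{12} \\
K_{11} & B_{11} & K_{21} & B_{12} \\
A_{12} & K_{21} & A_{22} & K_{22} \\
K_{12} & B_{12} & K_{22} & B_{22}
\end{pmatrix}$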


where  A  «  —  4  <Su  Sti  . 

This is invariant since $S_{11}$, $S_{12}$, $S_{21}$ and $S_{22}$ are all invariant. It has
been shown that $A_{11} + B_{11} + A_{22} + B_{22} + 2A_{12} + 2B_{12}$ is invariant.
Consequently, the value of the determinant as well as the sum of the
terms in the main diagonal of the matrix, and of the two minors that
contain no terms of the main diagonal, are invariant.

In the case of incoherent light, $h \neq l$ in the $\phi_{jk}(\tau)$ function. This
means that there is no persistent definite frequency relation between the
corresponding components of the two polarized rays of light, i.e., this
is the case of two incoherent beams of polarized light.

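The following eight terms now become equal to zero:

(47)  $A_{1k}\bar{A}_{3k} = 0, \quad A_{3k}\bar{A}_{1k} = 0,$
      $A_{1k}\bar{A}_{4k} = 0, \quad A_{4k}\bar{A}_{1k} = 0,$
      $A_{2k}\bar{A}_{3k} = 0, \quad A_{3k}\bar{A}_{2k} = 0,$
      $A_{2k}\bar{A}_{4k} = 0, \quad A_{4k}\bar{A}_{2k} = 0,$

and the coherency matrix takes the following form:

(48)  $\begin{pmatrix}
A_1\bar{A}_1 & \bar{A}_1 B_1 e^{i(\psi_1 - \phi_1)} & 0 & 0 \\
A_1\bar{B}_1 e^{-i(\psi_1 - \phi_1)} & B_1\bar{B}_1 & 0 & 0 \\
0 & 0 & A_2\bar{A}_2 & \bar{A}_2 B_2 e^{i(\psi_2 - \phi_2)} \\
0 & 0 & A_2\bar{B}_2 e^{-i(\psi_2 - \phi_2)} & B_2\bar{B}_2
\end{pmatrix}$

where $\Delta = 0$.

Taking the real part gives

(49)  $\begin{pmatrix}
A_{11} & K_{11} & 0 & 0 \\
K_{11} & B_{11} & 0 & 0 \\
0 & 0 & A_{22} & K_{22} \\
0 & 0 & K_{22} & B_{22}
\end{pmatrix}$

where $\Delta = S_{11}^2 S_{22}^2$, which is an invariant.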




Thus it is seen that the value of the determinant of this matrix for two
incoherent beams of polarized light equals the product of the value of
the determinants of the matrices of each beam of polarized light considered
separately. Therefore the determinant of the matrix and the
sum of its diagonal terms are invariants for this case.
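The product rule for the determinant in the incoherent case can be confirmed numerically. The Python sketch below is only an illustration; it assumes the real matrix has the two single-beam blocks with entries $A_j^2$, $A_j B_j \cos(\psi_j - \phi_j)$, $B_j^2$ on its diagonal and zeros elsewhere, and the helper names and sample rays are our own.

```python
import math


def det(m):
    """Determinant by cofactor expansion along the first row (fine for 4 x 4)."""
    if len(m) == 1:
        return m[0][0]
    value = 0.0
    for col in range(len(m)):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        value += (-1) ** col * m[0][col] * det(minor)
    return value


def real_block(A, B, phi, psi):
    """Real part of the coherency matrix of one beam."""
    K = A * B * math.cos(psi - phi)
    return [[A * A, K], [K, B * B]]


def incoherent_real_matrix(ray1, ray2):
    """4 x 4 real matrix for two incoherent beams: the two blocks on the diagonal."""
    b1, b2 = real_block(*ray1), real_block(*ray2)
    return [b1[0] + [0.0, 0.0], b1[1] + [0.0, 0.0],
            [0.0, 0.0] + b2[0], [0.0, 0.0] + b2[1]]


# Assumed sample rays given as (A, B, phi, psi).
ray1 = (1.2, 0.9, 0.3, 0.8)
ray2 = (0.7, 1.4, 1.1, 2.0)
matrix = incoherent_real_matrix(ray1, ray2)
S11 = ray1[0] * ray1[1] * math.sin(ray1[3] - ray1[2])
S22 = ray2[0] * ray2[1] * math.sin(ray2[3] - ray2[2])
```

The determinant of the full matrix agrees with $S_{11}^2 S_{22}^2$, the product of the two single-beam determinants.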

Thus the problem of polarized light lends itself to analysis by three
distinct methods. The invariants in each case are the same energy
functions, which is to be expected from physical considerations. In
conservative systems, no energy is lost under a rotation, and therefore
the energy is invariant.


A  STUDY  OF  SIXTEEN  COHERENCY  MATRICES 
By Dorothy W. Weeks*


The theory of the coherency matrix with its application to a single
beam of polarized light has been developed by Wiener.* It has been
extended to the case of two polarized rays of homogeneous monochromatic
light by the author in the paper preceding this.

Let $\Psi_1$ and $\Psi_2$ represent the two rays.

When

$\Psi_j = \xi_j + i\eta_j, \quad j = 1, 2$

and

$\xi_j = A_j e^{i(pt + \phi_j)}$

$\eta_j = B_j e^{i(pt + \psi_j)}.$

The coherency matrix for two coherent rays is given by

$\begin{pmatrix}
A_1\bar{A}_1 & \bar{A}_1 B_1 e^{i(\psi_1 - \phi_1)} & \bar{A}_1 A_2 e^{i(\phi_2 - \phi_1)} & \bar{A}_1 B_2 e^{i(\psi_2 - \phi_1)} \\
A_1\bar{B}_1 e^{-i(\psi_1 - \phi_1)} & B_1\bar{B}_1 & A_2\bar{B}_1 e^{i(\phi_2 - \psi_1)} & \bar{B}_1 B_2 e^{i(\psi_2 - \psi_1)} \\
A_1\bar{A}_2 e^{-i(\phi_2 - \phi_1)} & B_1\bar{A}_2 e^{-i(\phi_2 - \psi_1)} & A_2\bar{A}_2 & \bar{A}_2 B_2 e^{i(\psi_2 - \phi_2)} \\
A_1\bar{B}_2 e^{-i(\psi_2 - \phi_1)} & B_1\bar{B}_2 e^{-i(\psi_2 - \psi_1)} & A_2\bar{B}_2 e^{-i(\psi_2 - \phi_2)} & B_2\bar{B}_2
\end{pmatrix}$

The matrix for two incoherent rays of light is similar to the one for two
coherent rays except that all terms are equal to zero when the $\lambda$’s are not
equal. The sum of the diagonal terms of the coherency matrix is an
invariant. It is equal to the sum of the energies of the several wave
components. $\Psi\bar{\Psi}$, which is equal to the sum of the energies of the
several components plus the mutual energies, is also invariant. It is
defined as the optical density.

Sixteen matrices representing different types of coherent and incoherent,
polarized or unpolarized light are made the basis for the following
study, the energy of the components being taken equal to unity.
The matrices $I_i$ represent incoherent light, and those called $C_i$ coherent
light.

* Wilson College, Chambersburg, Pennsylvania.

* N. Wiener, Coherency Matrices and Quantum Theory. M. I. T. Journal of
Mathematics and Physics, Vol. VII, No. 2, June, 1928.



The rules of combination for these matrices are:

$I_i I_i = 2 I_i$
$C_i C_i = 2 C_i$

It  It  -  /Ta 
Cl  Cf^WTCi 
It  Ci  -  Cl  It. 

The resulting matrix represents the combination of four beams of light.
It is therefore to be expected physically that $I_i I_i = 2 I_i$.


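FUNDAMENTAL MATRICES

Two rays plane polarized at 45°.
$\begin{pmatrix} 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 \end{pmatrix}$

Two rays unpolarized.
$\begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 2 \end{pmatrix}$

One ray plane polarized at 45°, the other at 135°.
$\begin{pmatrix} 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & -1 & 1 \end{pmatrix}$

Two rays polarized at 0°.
$\begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$

Two rays circularly polarized with same sense of rotation.
$\begin{pmatrix} 1 & -i & 0 & 0 \\ i & 1 & 0 & 0 \\ 0 & 0 & 1 & -i \\ 0 & 0 & i & 1 \end{pmatrix}$

One ray unpolarized.
$\begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$

Two rays circularly polarized with opposite sense of rotation.
$\begin{pmatrix} 1 & -i & 0 & 0 \\ i & 1 & 0 & 0 \\ 0 & 0 & 1 & i \\ 0 & 0 & -i & 1 \end{pmatrix}$

Two rays plane polarized in the opposite directions.
$\begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 2 \end{pmatrix}$

Two rays unpolarized in resonance.
$\begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{pmatrix}$

Two rays unpolarized plane coupling at 45°.
$\begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix}$

Two rays unpolarized one pair of components in resonance, others interfering.
$\begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & -1 \\ 1 & 0 & 1 & 0 \\ 0 & -1 & 0 & 1 \end{pmatrix}$

Two rays unpolarized plane coupling at 135°.
$\begin{pmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ -1 & 0 & 0 & 1 \end{pmatrix}$

Two rays unpolarized both pairs of components in same quadrature.
$\begin{pmatrix} 1 & 0 & -i & 0 \\ 0 & 1 & 0 & -i \\ i & 0 & 1 & 0 \\ 0 & i & 0 & 1 \end{pmatrix}$

Two rays unpolarized circularly coupled.
$\begin{pmatrix} 1 & 0 & 0 & -i \\ 0 & 1 & -i & 0 \\ 0 & i & 1 & 0 \\ i & 0 & 0 & 1 \end{pmatrix}$

Two rays unpolarized both pairs of components in quadrature, the quadrature differing by $\pi$.
$\begin{pmatrix} 1 & 0 & -i & 0 \\ 0 & 1 & 0 & i \\ i & 0 & 1 & 0 \\ 0 & -i & 0 & 1 \end{pmatrix}$

Two rays unpolarized circularly coupled with opposite senses.
$\begin{pmatrix} 1 & 0 & 0 & -i \\ 0 & 1 & i & 0 \\ 0 & -i & 1 & 0 \\ i & 0 & 0 & 1 \end{pmatrix}$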


TABLE 1: Phase Relations for the Sixteen Fundamental Matrices

[Table entries are not recoverable from the source scan.]


THE GROUPS OF TRANSFORMATION MATRICES

The  next  step  is  the  determination  of  the  matrices  which  transform 
the  matrix  representing  one  kind  of  light  into  a  different  kind,  that  is,  to 
determine  M  in  the  following  relations 

I,  m:‘ 

/,  -  Cf 

Ci  =  M,  Cf  M-\ 

These transformation matrices divide into two main types, the $\Gamma$ and the
$\Phi$ types. To the $\Gamma$ type belong the $\delta$ and $\gamma$ matrices. To the $\Phi$ type
belong the $\rho$ and $\mu$.


The $\Gamma$ Class

TABLE 2: Relations Between the Coherency and Transformation Matrices

[Table entries are not recoverable from the source scan.]

In determining the matrices of this group it was found that a choice
from twenty-four possible types of matrices must be made, that is, in
satisfying the conditions imposed by the transformations, twenty-four
possible types were involved. These twenty-four formed a non-Abelian
closed group. This group was resolved into three groups: one each of
four, three, and two elements. The three group is irreducible except by
the use of imaginaries. The types in this group were selected for the
transformation  matrices  and  are  called  0,  5  and  y.  No  other  set  of 
these  matrices  satisfying  the  imposed  conditions  formed  a  closed  group. 
Each  of  the  twenty-four  types  represents  a  group  of  thirty-two  elements. 
These  groups  are  closed  Abelian  groups,  and  each  can  be  resolved  into 
four  groups:  three  groups  of  two  elements  each,  and  one  group  with  four. 




TABLE 3: Types of Transformation Matrices

Reducible to Three Groups: (1, 2, 3, 4) (1, 13, 21) (1, 5)
Transformation Matrices 1, 13, 21

[The entries of the twenty-four 4 × 4 matrices of this table, numbered 1 to 24, are not recoverable from the source scan.]

The $\Phi$ Class

The determination of the transformation matrices of this group differed
from that of the $\Gamma$ groups. The $\rho$ and $\mu$ types do not form a
closed group. The matrices of this class are divided into three main
types, each type having two different forms. A matrix of the following
type

0 
0 
0 
0 


10  0 
10  0 
0  1  0 
0  1  0 


is  called 




0  0 
i  1 
0  0 
0  0 

This terminology is used for the p and p types.

Each matrix of the p and p groups is an element of a four group,
each group being independent. A number of types of matrices satisfied
the transformation conditions. All simple types satisfied the following
conditions:


a 

a 

0 

0 

a 

0 

a 

0 

a 

0 

0 

a 

b 

b 

0 

0 

0 

b 

0 

6 

0 

b 

6 

0 

0 

0 

c 

•c 

c 

0 

c 

0 

0 

c 

c 

0 

0 

0 

d 

d 

0 

d 

0 

d 

d 

0 

0 

d 

0  0 
0  0 
1  i 
0  0 


is  called 


CHARACTERIZATION  OF  OPTICAL  INSTRUMENTS  BY  MATRICES 

It has been shown that any one of the sixteen fundamental coherency
matrices can be transformed into any other by the transformation
matrices. Since each of the coherency matrices represents a particular
type of light, the transformation matrices characterize the optical
instruments that change one type of light into another. The instruments
represented by the F transformation matrices are conservative
optical instruments, in that the power of the input is identical with the
power of the output. These optical instruments form a closed group.
This method thus affords a means of studying the behaviour of
different types of light under different optical instruments. From
Table 3 it is readily seen that the same instrument can be used to
transform several different types of light. These instruments are in general
not reversible.

Michelson’s Interferometer utilizes two beams of light. If the paths
of the emitted rays differ by even multiples of π, reinforcement takes
place; if by odd multiples of π, interference. The coherency matrices
for these two types of emitted rays are Ci and C*.

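The matrix

\[
\begin{Vmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{Vmatrix}
\]

of the form p* characterizes the action of the interferometer for
reinforcement, the matrix

\[
\begin{Vmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & -1 \\ 1 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{Vmatrix}
\]

for interference.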

Interference of two beams of light may be produced by the divided
lens method. The theory of this instrument is similar to Michelson’s
Interferometer and is characterized by the same transformation
matrices. This instrument can be used in the study of interference of
polarized light by placing tourmaline crystals in the paths of the two
rays. Tourmaline transmits light in one direction only, absorbing the
light in the other. Therefore tourmaline is not a conservative
instrument. Interference does or does not take place according to the position
of the axes of the two tourmaline crystals. If these axes are parallel,
interference occurs; if at right angles no interference fringes appear.
The matrices of the emitted rays are:

when interfering

\[
\begin{Vmatrix} 1 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{Vmatrix}
\]

when reinforcing

\[
\begin{Vmatrix} 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{Vmatrix}
\]

when the axes of the tourmaline crystals are at right angles

\[
\begin{Vmatrix} 1 & 0 & 0 & x \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ x & 0 & 0 & 1 \end{Vmatrix}
\qquad \text{where } x = \pm i \text{ or } \pm 1.
\]

Thus polarized light can easily be analyzed by the method of coherency
matrices. Given a coherency matrix representing a ray of light,
the type of light and the relation of its components can be immediately
stated. The transformations changing one type of light into another
are easily determined. The coherency method gives a simple way of
analyzing the many possible combinations of rays of light and optical
instruments.


THE  CRITERION  FOR  A  STATIONARY  POINT  OF  ONE  OF  A 
SET  OF  IMPLICIT  FUNCTIONS 

By Prescott D. Crout

1. Introduction. The purpose of this paper is to consider the
necessary and sufficient condition for a function

\[ u = u(x_1, x_2, \ldots, x_n), \]

where the variables $x_1, x_2, \ldots, x_n$ are restricted by the equations

\[
\begin{aligned}
0 &= f_1(x_1, x_2, \ldots, x_n) \\
0 &= f_2(x_1, x_2, \ldots, x_n) \\
  &\;\;\vdots \\
0 &= f_m(x_1, x_2, \ldots, x_n),
\end{aligned}
\]

to be stationary at a point. The criterion for a stationary point is
derived, and is expressed very simply as a condition upon the ranks of
certain functional matrices. This condition may also be expressed by
setting certain functional determinants equal to zero, which form is very
convenient for practical use. The relationship between the above
criterion and Lagrange’s method of multipliers is brought out; furthermore
in attaching a functional significance to the determinants involved,
expressions for partial derivatives of u are obtained which show the
connection between this criterion and that obtained by setting partial
derivatives of u with respect to independent variables equal to zero.

2. Systems Involving One Independent Variable. By way of
illustration let us first consider the case where there is but one
independent variable, and but one equation of condition. Thus let it be
required to find the points where the function

\[ u = f_1(x, y) \tag{1} \]

is stationary subject to the condition

\[ 0 = f_2(x, y). \tag{2} \]

Differentiating, we have

\[
\begin{aligned}
0 = du &= \frac{\partial f_1}{\partial x}\,dx + \frac{\partial f_1}{\partial y}\,dy \\
0 &= \frac{\partial f_2}{\partial x}\,dx + \frac{\partial f_2}{\partial y}\,dy,
\end{aligned}
\tag{3}
\]


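a system of homogeneous linear algebraic equations in dx and dy, the
coefficients being the partial derivatives, which take definite numerical
values at any given point. Obviously the trivial solution

\[ dx = dy = 0 \]

is of no importance here. For there to be a nonzero solution to the
system (3), it is necessary and sufficient that the determinant

\[
\begin{vmatrix}
\dfrac{\partial f_1}{\partial x} & \dfrac{\partial f_1}{\partial y} \\[2ex]
\dfrac{\partial f_2}{\partial x} & \dfrac{\partial f_2}{\partial y}
\end{vmatrix}
\equiv \frac{\partial(f_1, f_2)}{\partial(x, y)} = 0.
\tag{4}
\]

This is a necessary condition for the existence of a stationary point;
thus if such a point exists, equations (2) and (4) together determine
the corresponding values of x and y.

We next consider the case where there is one independent variable,
but where there may be any number of equations of condition; thus
let it be required to find the stationary points of the function

\[ u = f_1(x_1, x_2, \ldots, x_n) \tag{5} \]

subject to the conditions

\[
\begin{aligned}
0 &= f_2(x_1, x_2, \ldots, x_n) \\
0 &= f_3(x_1, x_2, \ldots, x_n) \\
  &\;\;\vdots \\
0 &= f_n(x_1, x_2, \ldots, x_n).
\end{aligned}
\tag{6}
\]

Differentiating, we have

\[
\begin{aligned}
0 = du &= \frac{\partial f_1}{\partial x_1}\,dx_1 + \frac{\partial f_1}{\partial x_2}\,dx_2 + \cdots + \frac{\partial f_1}{\partial x_n}\,dx_n \\
0 &= \frac{\partial f_2}{\partial x_1}\,dx_1 + \frac{\partial f_2}{\partial x_2}\,dx_2 + \cdots + \frac{\partial f_2}{\partial x_n}\,dx_n \\
  &\;\;\vdots \\
0 &= \frac{\partial f_n}{\partial x_1}\,dx_1 + \frac{\partial f_n}{\partial x_2}\,dx_2 + \cdots + \frac{\partial f_n}{\partial x_n}\,dx_n,
\end{aligned}
\tag{7}
\]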




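whence reasoning as with (3) we have as a necessary condition for the
existence of a stationary point

\[
\begin{vmatrix}
\dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\
\vdots & & \vdots \\
\dfrac{\partial f_n}{\partial x_1} & \cdots & \dfrac{\partial f_n}{\partial x_n}
\end{vmatrix}
\equiv \frac{\partial(f_1, f_2, \ldots, f_n)}{\partial(x_1, x_2, \ldots, x_n)} = 0.
\tag{8}
\]

If such a point exists, (8) and (6) fix the corresponding values of
$x_1, x_2, \ldots, x_n$.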
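The two-variable condition (4) can be tried out numerically. The sketch below is only an illustration: the function $u = x^2 + y^2$ and the condition $0 = x + y - 1$ are assumed examples that do not appear in the paper, and the helper names are hypothetical. It evaluates the determinant of (4) along the constraint and bisects for the point where it vanishes.

```python
def jacobian_det_2x2(grad_f1, grad_f2):
    """Determinant d(f1, f2)/d(x, y) built from the two gradients."""
    return grad_f1[0] * grad_f2[1] - grad_f1[1] * grad_f2[0]


def grad_u(x, y):
    # Assumed example: u = f1(x, y) = x**2 + y**2.
    return (2.0 * x, 2.0 * y)


def grad_condition(x, y):
    # Assumed example: 0 = f2(x, y) = x + y - 1.
    return (1.0, 1.0)


def condition_4(x, y):
    """Left-hand side of (4) at the point (x, y)."""
    return jacobian_det_2x2(grad_u(x, y), grad_condition(x, y))


def stationary_point_on_line(lo=-5.0, hi=5.0, steps=200):
    """Bisect condition (4) along the constraint y = 1 - x."""
    def g(x):
        return condition_4(x, 1.0 - x)

    assert g(lo) * g(hi) < 0, "the bracket must contain a sign change"
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    x = 0.5 * (lo + hi)
    return x, 1.0 - x


x_star, y_star = stationary_point_on_line()
print(x_star, y_star)
```

For this assumed example the determinant reduces to $2x - 2y$, so (2) and (4) together single out the point $x = y = 1/2$.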

3. The Necessary and Sufficient Condition for the Existence of a
Stationary Point. Having illustrated the method in the preceding
section, we shall now examine the process somewhat more closely in its
application to the general case where there are any number of
independent variables, and any number of equations of condition.

In the first place it is clear that the direct treatment of first
differentials in the systems (3) and (7) is allowable, for it represents the
limiting case of vanishing increments. For example, in the case of (7)
we could deal with the actual increments instead of the differentials by
replacing the d’s by Δ’s, and by writing $(\partial f_i/\partial x_j + \epsilon_{ij})$ in place of
$\partial f_i/\partial x_j$, where $\epsilon_{ij}$ vanishes with the increments. This requires that the
functions involved be totally differentiable at the point in question.
The process now goes on as before; whence finally in passing to the
limit, the ε’s vanish and leave (8). In what follows we shall deal with
first differentials directly, the above limiting process being inferred.
We see, however, that it is necessary only to postulate the total
differentiability of the functions involved at the point in question.

Let us now investigate the condition for the function

\[ u = u(x_1, x_2, \ldots, x_n) \tag{9} \]

to have a stationary point subject to the conditions

\[
\begin{aligned}
0 &= f_1(x_1, x_2, \ldots, x_n) \\
0 &= f_2(x_1, x_2, \ldots, x_n) \\
  &\;\;\vdots \\
0 &= f_m(x_1, x_2, \ldots, x_n).
\end{aligned}
\tag{10}
\]


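Differentiating, it follows that

\[
\begin{aligned}
0 = du &= \frac{\partial u}{\partial x_1}\,dx_1 + \frac{\partial u}{\partial x_2}\,dx_2 + \cdots + \frac{\partial u}{\partial x_n}\,dx_n \\
0 &= \frac{\partial f_1}{\partial x_1}\,dx_1 + \frac{\partial f_1}{\partial x_2}\,dx_2 + \cdots + \frac{\partial f_1}{\partial x_n}\,dx_n \\
  &\;\;\vdots \\
0 &= \frac{\partial f_m}{\partial x_1}\,dx_1 + \frac{\partial f_m}{\partial x_2}\,dx_2 + \cdots + \frac{\partial f_m}{\partial x_n}\,dx_n,
\end{aligned}
\tag{11}
\]

the total differentiability of the functions involved at the point in
question being postulated. We next suppose that the matrix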


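\[
\begin{Vmatrix}
\dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\[2ex]
\dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} & \cdots & \dfrac{\partial f_2}{\partial x_n} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial f_m}{\partial x_1} & \dfrac{\partial f_m}{\partial x_2} & \cdots & \dfrac{\partial f_m}{\partial x_n}
\end{Vmatrix}
\tag{12}
\]

is of rank r, where $r \le m$; furthermore we can suppose without loss of
generality that the determinant of order r formed by the elements of
the upper left hand corner of the matrix does not vanish. Then the
equations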


\[
0 = \sum_{j=1}^{n} \frac{\partial f_i}{\partial x_j}\,dx_j, \qquad i = 1, 2, \ldots, r,
\tag{13}
\]


may be solved to give $dx_1, dx_2, \ldots, dx_r$ as linear combinations of the
remaining differentials $dx_{r+1}, dx_{r+2}, \ldots, dx_n$; furthermore any complete
set of differentials obtained by arbitrarily choosing the last $(n - r)$
and determining the first r from (13) satisfies the remaining equations

\[
0 = \sum_{j=1}^{n} \frac{\partial f_i}{\partial x_j}\,dx_j, \qquad i = r + 1, r + 2, \ldots, m,
\tag{14}
\]


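as well as (13).¹ For all such sets of differentials to satisfy also the
first equation of (11) it is necessary and sufficient that the rank of the
matrix

\[
\begin{Vmatrix}
\dfrac{\partial u}{\partial x_1} & \dfrac{\partial u}{\partial x_2} & \cdots & \dfrac{\partial u}{\partial x_n} \\[2ex]
\dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial f_r}{\partial x_1} & \dfrac{\partial f_r}{\partial x_2} & \cdots & \dfrac{\partial f_r}{\partial x_n}
\end{Vmatrix}
\tag{15}
\]

is also r, in which case the complete matrix

\[
\begin{Vmatrix}
\dfrac{\partial u}{\partial x_1} & \dfrac{\partial u}{\partial x_2} & \cdots & \dfrac{\partial u}{\partial x_n} \\[2ex]
\dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial f_m}{\partial x_1} & \dfrac{\partial f_m}{\partial x_2} & \cdots & \dfrac{\partial f_m}{\partial x_n}
\end{Vmatrix}
\tag{16}
\]

is also of rank r. We have thus proved the following theorem: Let
there be given a function (9) and a set of equations of condition (10), and
let the functions involved be totally differentiable at a point; then the
necessary and sufficient condition for the function (9) to be stationary at that
point subject to the conditions (10) is that the matrix (12) and the
augmented matrix (16) be of the same rank.

¹ Pascal, "Die Determinanten," p. 197.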
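The rank criterion of the theorem lends itself to a direct check at a given point. The following is a minimal sketch under stated assumptions: the function $u = x^2 + y^2 + z^2$ and the conditions $0 = x + y + z - 3$, $0 = x - y$ are illustrative choices not taken from the paper, and `rank` and `is_stationary` are hypothetical helper names. Exact rational arithmetic is used so that the rank comparison is not disturbed by rounding.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a rectangular array, by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def is_stationary(grad_u, condition_grads):
    """True when matrix (12) and augmented matrix (16) have the same rank."""
    return rank(condition_grads) == rank([grad_u] + condition_grads)


def gradients(x, y, z):
    # Assumed example: u = x^2 + y^2 + z^2, f1 = x + y + z - 3, f2 = x - y.
    grad_u = [2 * x, 2 * y, 2 * z]
    condition_grads = [[1, 1, 1], [1, -1, 0]]
    return grad_u, condition_grads


print(is_stationary(*gradients(1, 1, 1)))    # point satisfying both conditions
print(is_stationary(*gradients(2, 2, -1)))   # another point satisfying both conditions
```

Both test points satisfy the two assumed conditions; only at the first does the row of derivatives of u depend linearly on the rows of (12).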

Since the determinant of order r in the upper left hand corner of the
matrix (12) does not vanish, the first r equations of (10) may be solved
explicitly to give $x_1, x_2, \ldots, x_r$ as functions of $x_{r+1}, x_{r+2}, \ldots, x_n$ in the
vicinity of the point in question;² and the values so obtained may then
be substituted in (9); hence in the vicinity of this point u may be
expressed as a function of $n - r$ independent variables. It is thus
evident that a decrease of r by one unit corresponds to an increase of one
unit in the number of independent variables, or degrees of freedom;
furthermore we see that in this connection we are interested in the rank r
rather than in the total number of conditional equations m. In other
words an equation of (10) which does not contribute one unit to r
really imposes no additional condition at that point.

It is obvious that if other determinants of order r of the matrix (12)
differ from zero, then a mere interchange of rows and columns would
bring such a determinant into the upper left hand corner. In such a case
the same restrictions would be expressed by another set of r functions
of (10), and another set of $(n - r)$ x’s would serve as independent
variables.

² La Vallée Poussin, "Cours d’Analyse Infinitésimale," vol. 1, p. 130.

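As an example, we now see that in the preceding section the
function (5) involves a single independent variable in the vicinity of a point
if and only if the functional matrix of the equations (6) is of rank $(n - 1)$
at that point.

The above criterion may also be expressed in the following equivalent
form:

\[
\begin{vmatrix}
\dfrac{\partial u}{\partial x_1} & \dfrac{\partial u}{\partial x_2} & \cdots & \dfrac{\partial u}{\partial x_r} & \dfrac{\partial u}{\partial x_i} \\[2ex]
\dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_r} & \dfrac{\partial f_1}{\partial x_i} \\
\vdots & \vdots & & \vdots & \vdots \\
\dfrac{\partial f_r}{\partial x_1} & \dfrac{\partial f_r}{\partial x_2} & \cdots & \dfrac{\partial f_r}{\partial x_r} & \dfrac{\partial f_r}{\partial x_i}
\end{vmatrix}
= 0, \qquad i = r + 1, r + 2, \ldots, n.
\tag{17}
\]

As stated above, r is the rank of the matrix (12), the determinant of
order r in the upper left hand corner of that matrix being different
from zero. In any practical application of the above criterion, the form
(17) would be used; thus the quantities $x_1, x_2, \ldots, x_n$ are determined
by the $n - r$ equations (17), and the first r equations of (10).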

In determining stationary points, r is in general not known at first,
for the rank may change abruptly at certain points; hence r is placed
successively equal to $m, m - 1, \ldots$, and the criterion applied in
each case.

As  illustrations  let  us  finally  consider  the  application  of  the  above 
criterion  to  two  technical  problems. 

1. In a certain piece of electrical apparatus, a transformer for
example, the copper losses vary as the square of the load current; and the
iron losses vary as the 1.6 power of the load voltage. Let it be required
to find the relative value of these losses when the transformer is
operating at maximum efficiency.

We are given

\[ W_{cu} = c_1 I^2, \qquad W_{Fe} = c_2 V^{1.6}, \]

and it is desired to render the total loss

\[ W = W_{cu} + W_{Fe} \]

a minimum for a given volt ampere output

\[ VI = c_3, \]

where $c_1$, $c_2$, and $c_3$ are constants. We thus have

\[ u = c_1 I^2 + c_2 V^{1.6}, \qquad 0 = VI - c_3. \]

Referring to (17) we see that $r = 1$, $n = 2$, $x_1 = I$, $x_2 = V$; hence (17)
consists of the single condition

\[
\begin{vmatrix} 2c_1 I & 1.6\,c_2 V^{0.6} \\ V & I \end{vmatrix} = 0,
\]

which is also identical with (4). Evaluating we have

\[ 2c_1 I^2 = 1.6\,c_2 V^{1.6}, \]

or

\[ W_{cu} = 0.80\,W_{Fe}, \]

which is the required result.
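The ratio 0.80 can be checked independently of the determinant condition by minimising the total loss numerically along the constraint $VI = c_3$. The sketch below makes its own assumptions: the numerical values of the constants are arbitrary, the function names are hypothetical, and a ternary search is used as a convenient minimiser because the loss, written as a function of V alone, is a sum of two convex terms.

```python
def total_loss(V, c1, c2, c3):
    """W = c1*I**2 + c2*V**1.6 with I eliminated through V*I = c3."""
    I = c3 / V
    return c1 * I ** 2 + c2 * V ** 1.6


def voltage_of_minimum_loss(c1, c2, c3, lo=1e-2, hi=1e3, iterations=300):
    """Ternary search for the minimising voltage on the bracket [lo, hi]."""
    for _ in range(iterations):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if total_loss(a, c1, c2, c3) < total_loss(b, c1, c2, c3):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)


def loss_ratio(c1, c2, c3):
    """Copper loss divided by iron loss at the operating point of least loss."""
    V = voltage_of_minimum_loss(c1, c2, c3)
    I = c3 / V
    return (c1 * I ** 2) / (c2 * V ** 1.6)


# Arbitrary illustrative constants; the ratio should not depend on them.
print(loss_ratio(0.5, 2.0, 100.0))
print(loss_ratio(2.0, 0.3, 50.0))
```

The bracket for V is an assumption of the sketch and must contain the minimum for the chosen constants.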

2. As a second example let us consider Maxwell’s derivation of the
distribution of molecular velocities in a gas. Denoting the three
velocity components by u, v, and w, let the number of molecules having
an x component of velocity lying between u and $u + du$ be $f(u)\,du$.
Then if the velocity distribution along one axis is independent of
velocities along the other two axes, it follows that the number of molecules
lying in an element of volume in the velocity space is $f(u)f(v)f(w)\,du\,dv\,dw$,
due to symmetry. But this can depend only upon the quantity
$(u^2 + v^2 + w^2)$, as there are no preferred directions. Our problem thus
becomes that of determining f so that the product $f(u)f(v)f(w)$ is
stationary over the sphere

\[ u^2 + v^2 + w^2 = A, \]

where A is an arbitrary constant. Referring to (17) we see that $r = 1$,
$n = 3$, $x_1 = u$, $x_2 = v$, $x_3 = w$; and

\[
\begin{vmatrix} f'(u)f(v)f(w) & f(u)f'(v)f(w) \\ 2u & 2v \end{vmatrix} = 0,
\qquad
\begin{vmatrix} f'(u)f(v)f(w) & f(u)f(v)f'(w) \\ 2u & 2w \end{vmatrix} = 0.
\]

Evaluating, we have

\[
\frac{f'(u)}{u f(u)} = \frac{f'(v)}{v f(v)} = \frac{f'(w)}{w f(w)} = \lambda_1,
\]

where it is evident that $\lambda_1$ must be a constant. Integrating, it now
follows immediately that

\[
f(u) = C e^{\lambda_1 u^2 / 2},
\qquad
f(u)f(v)f(w) = C^3 e^{\lambda_1 (u^2 + v^2 + w^2)/2},
\]

where C is a constant of integration, which is Maxwell’s distribution function.
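As a quick plausibility check, the two determinants of (17) for this example can be evaluated for a trial function f. The sketch below assumes an exponential of a multiple of $u^2$ with the integration constant set to 1 and an arbitrary value of the multiplier; these choices, and the function names, are illustrative and not taken from the paper. A non-Gaussian trial function is included for contrast.

```python
import math

LAMBDA = -0.5  # hypothetical value of the constant ratio f'(u) / (u f(u))


def gaussian(u):
    """Trial distribution exp(LAMBDA * u**2 / 2), normalising constant set to 1."""
    return math.exp(0.5 * LAMBDA * u * u)


def gaussian_prime(u):
    return LAMBDA * u * gaussian(u)


def conditions_17(f, f_prime, u, v, w):
    """The two 2x2 determinants of (17) for the product f(u) f(v) f(w)."""
    d1 = f_prime(u) * f(v) * f(w) * 2 * v - f(u) * f_prime(v) * f(w) * 2 * u
    d2 = f_prime(u) * f(v) * f(w) * 2 * w - f(u) * f(v) * f_prime(w) * 2 * u
    return d1, d2


# The Gaussian trial function makes both determinants vanish at any point.
print(conditions_17(gaussian, gaussian_prime, 0.7, -1.3, 2.1))

# A simple exponential, f(u) = exp(u), does not.
print(conditions_17(math.exp, math.exp, 0.7, -1.3, 2.1))
```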

4. Relation of the Above Criterion to Lagrange’s Method of
Multipliers. In the preceding section the necessary and sufficient
condition for a stationary point was derived. It is the purpose of this
section to show the relation of the criterion there developed to
Lagrange’s method of multipliers;³ hence using the method of Lagrange
let us again investigate the condition for a stationary point of the
function (9) subject to the equations of condition (10), the rank of the
functional matrix (12) being r. Here again we suppose without loss of
generality that the determinant of order r in the upper left hand corner
of this matrix does not vanish at the point in question. Differentiating,
we obtain (11); whence in accordance with Lagrange’s procedure we
multiply the last m equations of (11) by the unknown constants
$\lambda_1, \lambda_2, \ldots, \lambda_m$, and add; thus we obtain the necessary condition

\[
\left(\frac{\partial u}{\partial x_1} + \lambda_1 \frac{\partial f_1}{\partial x_1} + \lambda_2 \frac{\partial f_2}{\partial x_1} + \cdots + \lambda_m \frac{\partial f_m}{\partial x_1}\right) dx_1
+ \cdots +
\left(\frac{\partial u}{\partial x_n} + \lambda_1 \frac{\partial f_1}{\partial x_n} + \lambda_2 \frac{\partial f_2}{\partial x_n} + \cdots + \lambda_m \frac{\partial f_m}{\partial x_n}\right) dx_n = 0.
\tag{18}
\]

³ La Vallée Poussin, "Cours d’Analyse Infinitésimale," vol. 1, p. 148.


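Since the determinant of order r in the upper left hand corner of the
matrix (12) does not vanish, it follows that for any given set of values
of $\lambda_{r+1}, \lambda_{r+2}, \ldots, \lambda_m$, the constants $\lambda_1, \lambda_2, \ldots, \lambda_r$ can be determined
so that the coefficients of $dx_1, dx_2, \ldots, dx_r$ in (18) vanish. Let
$\lambda_1, \lambda_2, \ldots, \lambda_r$ be determined in this manner. Also, as in Section 3,
we see that since the above mentioned determinant of order r does not
vanish, the first r equations of (10) may be solved explicitly to give
$x_1, x_2, \ldots, x_r$ in terms of $x_{r+1}, x_{r+2}, \ldots, x_n$ in the vicinity of the point
in question; and these values may be substituted in (9) to give u as a
function of $x_{r+1}, x_{r+2}, \ldots, x_n$ in the vicinity of this point, these variables
being independent. Since the values of the corresponding differentials
in (18) are arbitrary, it follows that the coefficients of these
differentials must also vanish; whence it follows that a necessary condition
for a point to be a stationary point of (9) is that the system of equations

\[
\begin{aligned}
\frac{\partial u}{\partial x_1} + \lambda_1 \frac{\partial f_1}{\partial x_1} + \lambda_2 \frac{\partial f_2}{\partial x_1} + \cdots + \lambda_m \frac{\partial f_m}{\partial x_1} &= 0 \\
\frac{\partial u}{\partial x_2} + \lambda_1 \frac{\partial f_1}{\partial x_2} + \lambda_2 \frac{\partial f_2}{\partial x_2} + \cdots + \lambda_m \frac{\partial f_m}{\partial x_2} &= 0 \\
&\;\;\vdots \\
\frac{\partial u}{\partial x_n} + \lambda_1 \frac{\partial f_1}{\partial x_n} + \lambda_2 \frac{\partial f_2}{\partial x_n} + \cdots + \lambda_m \frac{\partial f_m}{\partial x_n} &= 0
\end{aligned}
\tag{19}
\]

has a solution.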

That this is also a sufficient condition is evident from the fact that
if it is satisfied, the sum (18) vanishes identically, this sum being du
in virtue of the relations

\[ df_1 = df_2 = \cdots = df_m = 0, \]

which in turn follow from (10).

Here the partial derivatives take definite values corresponding to the
point in question; thus (19) is a system of linear equations having
$\lambda_1, \lambda_2, \ldots, \lambda_m$ as unknowns. But the necessary and sufficient
condition for (19) to have a solution is that the matrix (12) and the augmented
matrix (16) be of the same rank, which is the criterion developed in
Section 3. In the present development, however, this criterion is merely
the necessary and sufficient condition that the equations (19) have a
solution, which fact is in itself a necessary and sufficient condition for
a stationary point. In the development of Section 3 this criterion was
shown to be a necessary and sufficient condition for a stationary point.
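The solvability of the linear system (19) can be examined directly. The sketch below is an illustration under assumptions: it reuses an invented example ($u = x^2 + y^2 + z^2$ with conditions $0 = x + y + z - 3$ and $0 = x - y$), the function name is hypothetical, and any multiplier left undetermined by the elimination is simply set to zero.

```python
from fractions import Fraction


def solve_multipliers(grad_u, condition_grads):
    """Solve system (19) for the multipliers; return None when it has no solution.

    One equation per variable x_j:
        sum_i lambda_i * df_i/dx_j = -du/dx_j.
    Multipliers left free by the elimination are set to zero.
    """
    m, n = len(condition_grads), len(grad_u)
    rows = [
        [Fraction(condition_grads[i][j]) for i in range(m)] + [Fraction(-grad_u[j])]
        for j in range(n)
    ]
    pivots, r = [], 0
    for col in range(m):
        p = next((k for k in range(r, n) if rows[k][col] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        pivot_value = rows[r][col]
        rows[r] = [v / pivot_value for v in rows[r]]
        for k in range(n):
            if k != r and rows[k][col] != 0:
                factor = rows[k][col]
                rows[k] = [a - factor * b for a, b in zip(rows[k], rows[r])]
        pivots.append(col)
        r += 1
        if r == n:
            break
    # A remaining row reading 0 = nonzero means (19) is inconsistent.
    if any(rows[k][m] != 0 for k in range(r, n)):
        return None
    multipliers = [Fraction(0)] * m
    for row_index, col in enumerate(pivots):
        multipliers[col] = rows[row_index][m]
    return multipliers


condition_grads = [[1, 1, 1], [1, -1, 0]]
print(solve_multipliers([2, 2, 2], condition_grads))    # gradient of u at (1, 1, 1)
print(solve_multipliers([4, 4, -2], condition_grads))   # gradient of u at (2, 2, -1)
```

Appending a redundant condition, for instance a row `[2, 2, 2]`, leaves the system solvable and merely introduces a further multiplier, which this sketch sets to zero.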




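It may be noted that since the matrix (12) is of rank r, the determinant
in the upper left hand corner being different from zero, values may be
arbitrarily assigned to $\lambda_{r+1}, \lambda_{r+2}, \ldots, \lambda_m$; then these values together
with the values of $\lambda_1, \lambda_2, \ldots, \lambda_r$ determined from the first r equations
of (19) will be a solution of (19). We thus see that this system of
equations has an infinity of solutions of order $(m - r)$.

It is clear that an equation of (10) which does not contribute one
unit to r, and which, therefore, as we saw in Section 3, imposes no
additional restriction at the point in question, merely gives rise to another
arbitrary λ. Since the constants $\lambda_{r+1}, \lambda_{r+2}, \ldots, \lambda_m$ are arbitrary, we
may set them all equal to zero; whence the system (19) becomes

\[
\begin{aligned}
\frac{\partial u}{\partial x_1} + \lambda_1 \frac{\partial f_1}{\partial x_1} + \lambda_2 \frac{\partial f_2}{\partial x_1} + \cdots + \lambda_r \frac{\partial f_r}{\partial x_1} &= 0 \\
&\;\;\vdots \\
\frac{\partial u}{\partial x_n} + \lambda_1 \frac{\partial f_1}{\partial x_n} + \lambda_2 \frac{\partial f_2}{\partial x_n} + \cdots + \lambda_r \frac{\partial f_r}{\partial x_n} &= 0.
\end{aligned}
\tag{20}
\]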


Here we see that equations of (10) which do not contribute to r, and
thus impose no additional restriction, automatically drop from
consideration. It is obvious that if other determinants of order r of the
matrix (12) differ from zero, then by interchanging rows and columns
other such determinants may be brought into the upper left hand
corner; in which case the same restrictions could be expressed by another
set of r functions of (10), and another set of $(n - r)$ x’s could be taken
as independent variables.

5. Relation between the Functional Matrix and Partial Derivatives
with Respect to Independent Variables.

In previous sections we have considered the ranks of certain matrices,
or have placed certain functional determinants equal to zero in order
to decide whether a given point is stationary, or in order to single out
stationary points. No other significance was given to these
determinants; and no attempt was made to consider their meaning as functions.
This question will now be considered.

Let there be given the function (9), and the equations of
condition (10). Here again let the rank of the matrix (12) at a given point
be r, the determinant of order r in the upper left hand corner of this
matrix being nonvanishing; then as in Section 3 we see that u may be
expressed in terms of $n - r$ independent variables in the vicinity of
this point. Let these variables be

\[
\begin{aligned}
t_1 &= \varphi_1(x_1, x_2, \ldots, x_n) \\
t_2 &= \varphi_2(x_1, x_2, \ldots, x_n) \\
    &\;\;\vdots \\
t_{n-r} &= \varphi_{n-r}(x_1, x_2, \ldots, x_n);
\end{aligned}
\tag{21}
\]


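hence differentiating (9), (10), and (21), we have

\[
\begin{aligned}
du &= \frac{\partial u}{\partial x_1}\,dx_1 + \frac{\partial u}{\partial x_2}\,dx_2 + \cdots + \frac{\partial u}{\partial x_n}\,dx_n \\
0 &= \frac{\partial f_i}{\partial x_1}\,dx_1 + \frac{\partial f_i}{\partial x_2}\,dx_2 + \cdots + \frac{\partial f_i}{\partial x_n}\,dx_n, \qquad i = 1, 2, \ldots, m \\
dt_k &= \frac{\partial \varphi_k}{\partial x_1}\,dx_1 + \frac{\partial \varphi_k}{\partial x_2}\,dx_2 + \cdots + \frac{\partial \varphi_k}{\partial x_n}\,dx_n, \qquad k = 1, 2, \ldots, n - r,
\end{aligned}
\tag{22}
\]

a system of $(1 + m + n - r)$ linear equations in the n differentials
$dx_1, dx_2, \ldots, dx_n$. Since the rank of the matrix (12) is r, and since the
upper left hand determinant of order r in this matrix is not zero, we may
omit from (22) the equations involving the functions $f_{r+1}, f_{r+2}, \ldots, f_m$;
furthermore the rank of the functional matrix of the equations of (22)
involving $f_1, f_2, \ldots, f_r, \varphi_1, \ldots, \varphi_{n-r}$ is n, for since $t_1, t_2, \ldots, t_{n-r}$
are independent variables, the differentials $dt_1, dt_2, \ldots, dt_{n-r}$ may
take arbitrary values. These n equations thus fix the values of $dx_1,
dx_2, \ldots, dx_n$ corresponding to a given set of differentials $dt_1, dt_2, \ldots,
dt_{n-r}$. For the values of $dx_1, dx_2, \ldots, dx_n$ so determined to be
consistent with the first equation of (22), it is necessary and sufficient that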




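\[
\begin{vmatrix}
du & \dfrac{\partial u}{\partial x_1} & \dfrac{\partial u}{\partial x_2} & \cdots & \dfrac{\partial u}{\partial x_n} \\[2ex]
0 & \dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\
\vdots & \vdots & \vdots & & \vdots \\
0 & \dfrac{\partial f_r}{\partial x_1} & \dfrac{\partial f_r}{\partial x_2} & \cdots & \dfrac{\partial f_r}{\partial x_n} \\[2ex]
dt_1 & \dfrac{\partial \varphi_1}{\partial x_1} & \dfrac{\partial \varphi_1}{\partial x_2} & \cdots & \dfrac{\partial \varphi_1}{\partial x_n} \\
\vdots & \vdots & \vdots & & \vdots \\
dt_{n-r} & \dfrac{\partial \varphi_{n-r}}{\partial x_1} & \dfrac{\partial \varphi_{n-r}}{\partial x_2} & \cdots & \dfrac{\partial \varphi_{n-r}}{\partial x_n}
\end{vmatrix}
= 0;
\tag{23}
\]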


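whence developing by minors with respect to the first column, it follows
that

\[
\frac{\partial(\varphi_1, \ldots, \varphi_{n-r}, f_1, \ldots, f_r)}{\partial(x_1, x_2, \ldots, x_n)}\,du
+ \sum_{j=1}^{n-r} (-1)^j\,
\frac{\partial(u, \varphi_1, \ldots, \varphi_{j-1}, \varphi_{j+1}, \ldots, \varphi_{n-r}, f_1, \ldots, f_r)}{\partial(x_1, x_2, \ldots, x_n)}\,dt_j = 0.
\tag{24}
\]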

It is thus evident that

\[
\frac{\partial u}{\partial t_i} =
\frac{\dfrac{\partial(\varphi_1, \ldots, \varphi_{i-1}, u, \varphi_{i+1}, \ldots, \varphi_{n-r}, f_1, \ldots, f_r)}{\partial(x_1, x_2, \ldots, x_n)}}
     {\dfrac{\partial(\varphi_1, \ldots, \varphi_{n-r}, f_1, \ldots, f_r)}{\partial(x_1, x_2, \ldots, x_n)}}.
\tag{25}
\]

We have just seen that the denominator of this fraction is not zero;
hence expanding the numerator by Laplace’s development with respect
to the two groups of functions $(\varphi_1, \ldots, \varphi_{n-r})$ and $(u, f_1, \ldots, f_r)$, we
see that the criterion of Section 3 requires that

\[
\frac{\partial u}{\partial t_i} = 0, \qquad i = 1, 2, \ldots, n - r,
\tag{26}
\]

and vice versa; hence that criterion and the equations (26) are
equivalent, as was to be expected since the variables $t_1, t_2, \ldots, t_{n-r}$ are
independent.
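Formula (25) can be compared with direct substitution in a small case. The sketch below takes $n = 2$, $r = 1$, and assumes, purely for illustration, $u = x^2 + y^2$, the condition $0 = x + y - 2$, and the independent variable $t = x$; none of these choices, nor the function names, come from the paper.

```python
def det2(a, b, c, d):
    """Determinant of the 2x2 array with rows (a, b) and (c, d)."""
    return a * d - b * c


def du_dt_by_jacobians(x, y):
    """Right-hand side of (25) for the assumed example."""
    grad_u = (2.0 * x, 2.0 * y)      # u = x^2 + y^2
    grad_f = (1.0, 1.0)              # f1 = x + y - 2
    grad_phi = (1.0, 0.0)            # t = phi(x, y) = x
    numerator = det2(grad_u[0], grad_u[1], grad_f[0], grad_f[1])        # d(u, f1)/d(x, y)
    denominator = det2(grad_phi[0], grad_phi[1], grad_f[0], grad_f[1])  # d(phi, f1)/d(x, y)
    return numerator / denominator


def du_dt_by_substitution(t, h=1e-6):
    """Central difference of u(t) after eliminating y = 2 - t by hand."""
    def u_of_t(s):
        return s ** 2 + (2.0 - s) ** 2

    return (u_of_t(t + h) - u_of_t(t - h)) / (2.0 * h)


x = 0.3
y = 2.0 - x
print(du_dt_by_jacobians(x, y), du_dt_by_substitution(x))
print(du_dt_by_jacobians(1.0, 1.0))   # vanishes at the stationary point x = y = 1
```

The two routes agree at points on the constraint, and the derivative with respect to the independent variable vanishes exactly where the determinant criterion is met.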


THE  THEORY  OF  LINEAR  MATRIX  TRANSFORMATIONS 
WITH  APPLICATIONS  TO  THE  THEORY  OF  LINEAR 
MATRIX  EQUATIONS 

By Lawrence Harris*

The object of this essay is to develop the theory of linear
transformations of matrices and the applications of this theory to the solution of
systems of linear matrix equations.

By a linear transformation of matrices is meant a relation such that
for a given set of matrices $x_1, \ldots, x_m$ a new set of matrices $y_1, \ldots, y_m$ is
determined uniquely by the equations

\[
y_i = \sum_{j=1}^{m} a_{ij} x_j \qquad (i = 1, 2, \ldots, m)
\tag{1}
\]

where $a_{ij}$ is a matrix. The theory developed in this essay will
consider only the case where $y_i$, $x_j$, $a_{ij}$ are square matrices of the nth order.
Since in general matrices are not commutative it will be necessary to
consider three other linear transformations; namely

\[ y_i = \sum_{j=1}^{m} a_{ji} x_j \tag{2} \]

\[ y_i = \sum_{j=1}^{m} x_j a_{ij} \tag{3} \]

\[ y_i = \sum_{j=1}^{m} x_j a_{ji} \tag{4} \]

The following conventions will be adopted in using matrices:

I. The matrix $a_{ij}$ will have the element $a^{ij}_{\lambda\mu}$ in the λ row and μ column.

II. The matrix $x_i$ will have the element $x^{i}_{\lambda\mu}$ in the λ row and μ column.

III. The matrix $y_i$ will have the element $y^{i}_{\lambda\mu}$ in the λ row and μ column.

IV. The transpose matrix of $a_{ij}$ will be written $\underline{a}_{ij}$ and will have
element $a^{ij}_{\mu\lambda}$ in the λ row and μ column. It follows immediately,
if $\underline{a}^{ij}_{\lambda\mu}$ is the element in the λ row and μ column of $\underline{a}_{ij}$, that $\underline{a}^{ij}_{\lambda\mu} = a^{ij}_{\mu\lambda}$.

* After having been started at the Massachusetts Institute of Technology in
1931, this work was subsequently presented in partial fulfillment for the degree of
Master of Arts at Columbia University in 1933.

The author at this time would like to acknowledge his indebtedness and thank
Prof. F. L. Hitchcock of the M.I.T. for his great interest and able assistance
during the early days of this work and Prof. W. B. Fite of Columbia University
who kindly read and corrected the whole manuscript.


Ordinary matrices will be represented by use of single bars, while
their determinants will be represented in the following manner:

\[ \det |a_{ij}| \]

means the determinant of the ordinary matrix whose element in the i row and j
column is $a_{ij}$.

The following definitions will now be given:

Definition I. The square array

\[
\mathfrak{A} =
\begin{Vmatrix}
a_{11} & \cdots & a_{1m} \\
\vdots & & \vdots \\
a_{m1} & \cdots & a_{mm}
\end{Vmatrix}
\tag{5}
\]

is defined as the double matrix of transformation (1).

Definition II. The associate matrix, obtained by replacing $a_{ij}$ by
its value $|a^{ij}_{\lambda\mu}|$ in (5), where $i, j = 1, 2, \ldots, m$; $\lambda, \mu = 1, 2, \ldots, n$, will
be known as the expandant of $\mathfrak{A}$. It will be written

\[ E(\mathfrak{A}). \tag{6} \]

It is a matrix of order mn.

Definition III. The determinant of a double matrix is defined to be
the determinant of its expandant. It will be written

\[ \Delta(\mathfrak{A}) = \det E(\mathfrak{A}). \tag{7} \]

Definition IV. Two double matrices are said to be equal when and
only when they have the same number of columns (or rows), and every
element of one is equal to the corresponding element of the other.

Definition V. The double matrix

\[
\begin{Vmatrix}
a_{11} & \cdots & a_{m1} \\
\vdots & & \vdots \\
a_{1m} & \cdots & a_{mm}
\end{Vmatrix}
\tag{8}
\]

is known as the conjugate double matrix to $\mathfrak{A}$ or simply as the conjugant
to $\mathfrak{A}$.

Definition VI. The double matrix

\[
\begin{Vmatrix}
\underline{a}_{11} & \cdots & \underline{a}_{1m} \\
\vdots & & \vdots \\
\underline{a}_{m1} & \cdots & \underline{a}_{mm}
\end{Vmatrix}
\tag{9}
\]

is known as the commuted double matrix to $\mathfrak{A}$ or simply as the
commutant of $\mathfrak{A}$.

By definition it is seen that the double matrix (5) is associated with
equation (1). In the same way it will be shown that the conjugant and
commutant and the conjugant of the commutant of $\mathfrak{A}$ are associated
with equations (2), (3) and (4). In equation (2) let $a_{ji}$ be replaced by
$\bar{a}_{ij}$, where $\bar{a}_{ij} = a_{ji}$; then

\[
y_i = \sum_{j=1}^{m} \bar{a}_{ij} x_j, \qquad i = 1, 2, \ldots, m.
\]

But by definition the double matrix of this system is

\[
\begin{Vmatrix}
\bar{a}_{11} & \cdots & \bar{a}_{1m} \\
\vdots & & \vdots \\
\bar{a}_{m1} & \cdots & \bar{a}_{mm}
\end{Vmatrix};
\]

therefore the conjugant of $\mathfrak{A}$ is associated with equation (2) and is said
to be the double matrix of this system.

In equation (3) notice that each term of the sum is merely the product
of two ordinary matrices of the nth order. Now, since the transpose of a
product is the product of the transposes in the reverse order, and the
transpose of a sum is the sum of the transposes, it follows immediately
that

\[
\underline{y}_i = \sum_{j=1}^{m} \underline{a}_{ij}\,\underline{x}_j, \qquad i = 1, 2, \ldots, m,
\tag{10}
\]

where $\underline{y}_i$ is the transpose of $y_i$, $\underline{x}_i$ is the transpose of $x_i$ and $\underline{a}_{ij}$ is the transpose
(conjugate) of $a_{ij}$. But the double matrix of (10) is

\[
\begin{Vmatrix}
\underline{a}_{11} & \cdots & \underline{a}_{1m} \\
\vdots & & \vdots \\
\underline{a}_{m1} & \cdots & \underline{a}_{mm}
\end{Vmatrix};
\]

hence the double matrix associated with (10) is the commutant of $\mathfrak{A}$
and, therefore, it will be said that the double matrix associated with (3)
is simply the commutant of $\mathfrak{A}$.
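The association of the conjugant with equation (2) and of the commutant with equation (3) can be illustrated on small integer matrices. The sketch below is only an illustration under assumptions: a double matrix is represented as a nested Python list of square blocks, the conjugant is taken to interchange the block indices, the commutant is taken to transpose each block in place, and all function names and numerical entries are hypothetical.

```python
def mat_mul(a, b):
    """Product of two ordinary square matrices of the same order."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_sum(matrices):
    """Entrywise sum of a non-empty list of ordinary matrices."""
    total = [row[:] for row in matrices[0]]
    for mat in matrices[1:]:
        total = [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(total, mat)]
    return total


def transpose(a):
    return [list(col) for col in zip(*a)]


def conjugant(A):
    """Block in position (i, j) is a_ji."""
    m = len(A)
    return [[A[j][i] for j in range(m)] for i in range(m)]


def commutant(A):
    """Block in position (i, j) is the transpose of a_ij."""
    return [[transpose(block) for block in row] for row in A]


def apply_ordinary(A, xs):       # equation (1): y_i = sum_j a_ij x_j
    return [mat_sum([mat_mul(A[i][j], xs[j]) for j in range(len(xs))]) for i in range(len(A))]


def apply_conjugate(A, xs):      # equation (2): y_i = sum_j a_ji x_j
    return [mat_sum([mat_mul(A[j][i], xs[j]) for j in range(len(xs))]) for i in range(len(A))]


def apply_commuted(A, xs):       # equation (3): y_i = sum_j x_j a_ij
    return [mat_sum([mat_mul(xs[j], A[i][j]) for j in range(len(xs))]) for i in range(len(A))]


A = [[[[1, 2], [3, 4]], [[0, 1], [1, 0]]],
     [[[2, 0], [1, 1]], [[1, -1], [0, 3]]]]
xs = [[[1, 0], [2, 1]], [[0, 3], [1, 1]]]

# Equation (2) is the ordinary transformation with the conjugant as double matrix.
print(apply_conjugate(A, xs) == apply_ordinary(conjugant(A), xs))

# Transposing equation (3) gives (10), whose double matrix is the commutant.
lhs = [transpose(y) for y in apply_commuted(A, xs)]
rhs = apply_ordinary(commutant(A), [transpose(x) for x in xs])
print(lhs == rhs)
```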

If  Oii  is  replaced  by  dn  in  (4)  where  Oa  ~  d^  then  (4)  becomes 

(11) 

j-i 

which  has  as  its  double  matrix 


dll  • 

••  di- 

On  • 

’  0,1 

• 

•  ^mm 

Oi.,  • 

since  equation  (11)  is  of  same  form  as  (3)  and  since  Oi,  «  d^  it  follows 
immediately  that  Oi,  da. 

Definition VII. Let $\mathfrak{A}$ and $\mathfrak{B}$ be two double matrices of the same
order; i.e., having the same number of rows and columns, each of whose
elements are square matrices of the same order. If $a_{ij}$ is the element in
the i row and j column of $\mathfrak{A}$ and $b_{ij}$ is the element in the i row and j column
of $\mathfrak{B}$, then the element in the ith row and jth column of the product
is $c_{ij} = \sum_{k=1}^{m} a_{ik} b_{kj}$, and $\mathfrak{A}\mathfrak{B} = \mathfrak{C} = \| c_{ij} \|$.

As in the case of ordinary matrices it is to be noted that the product of
two double matrices is a noncommutative operation; that is, in general,

\[ \mathfrak{A}\mathfrak{B} \ne \mathfrak{B}\mathfrak{A}. \]

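Theorem I. The multiplication of double matrices is associative.

Let $\mathfrak{A}$, $\mathfrak{B}$ and $\mathfrak{C}$ be double matrices; then

$$\mathfrak{A}\mathfrak{B} = \| d_{ij} \| \quad \text{where} \quad d_{ij} = \sum_{k=1}^{m} a_{ik} b_{kj},$$

$$(\mathfrak{A}\mathfrak{B})\mathfrak{C} = \| f_{ij} \| \quad \text{where} \quad f_{ij} = \sum_{l=1}^{m} d_{il} c_{lj},$$

but since $d_{il} = \sum_{k=1}^{m} a_{ik} b_{kl}$ it follows that

$$f_{ij} = \sum_{l=1}^{m} \sum_{k=1}^{m} a_{ik} b_{kl} c_{lj};$$

also $\mathfrak{B}\mathfrak{C} = \| g_{ij} \|$ where $g_{ij} = \sum_{l=1}^{m} b_{il} c_{lj}$, and

$$\mathfrak{A}(\mathfrak{B}\mathfrak{C}) = \| h_{ij} \| \quad \text{where} \quad h_{ij} = \sum_{k=1}^{m} a_{ik} g_{kj};$$

since $g_{kj} = \sum_{l=1}^{m} b_{kl} c_{lj}$ it follows that

$$h_{ij} = \sum_{l=1}^{m} \sum_{k=1}^{m} a_{ik} b_{kl} c_{lj}$$

and therefore

$$h_{ij} = f_{ij}$$

and finally

$$(\mathfrak{A}\mathfrak{B})\mathfrak{C} = \mathfrak{A}(\mathfrak{B}\mathfrak{C});$$

therefore multiplication of double matrices is associative.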
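Theorem I can be spot-checked on randomly chosen integer double matrices. The sketch below is illustrative and not from the paper; the nested-list representation, the helper names and the choice of sizes ($m = 3$, $n = 2$) are assumptions. A numerical check of this kind supports the theorem but does not replace the proof.

```python
import random


def mat_mul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_add(a, b):
    """Ordinary matrix sum."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def block_mul(A, B):
    """Product of double matrices: element (i, j) is sum_k a_ik b_kj."""
    n = len(A[0][0])
    C = []
    for i in range(len(A)):
        row = []
        for j in range(len(B[0])):
            acc = [[0] * n for _ in range(n)]
            for k in range(len(B)):
                acc = mat_add(acc, mat_mul(A[i][k], B[k][j]))
            row.append(acc)
        C.append(row)
    return C


def random_double_matrix(m, n, rng):
    """m x m double matrix whose elements are n x n integer matrices."""
    return [[[[rng.randint(-3, 3) for _ in range(n)] for _ in range(n)]
             for _ in range(m)] for _ in range(m)]


rng = random.Random(0)  # fixed seed so that the run is repeatable
A, B, C = (random_double_matrix(3, 2, rng) for _ in range(3))

left = block_mul(block_mul(A, B), C)   # (AB)C
right = block_mul(A, block_mul(B, C))  # A(BC)
print(left == right)
```

Integer entries are used so that the comparison is exact and free of rounding.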

The symbol $(\mathfrak{A})_{ij}$ will mean the element in the $i$th row and $j$th column
of the double matrix $\mathfrak{A}$.

It has been shown above that each matrix transformation (1), (2), (3)
and (4) has associated with it one of the possible forms of double
matrix. Therefore in the sequel (1) will be called the ordinary matrix
transformation; (2) the conjugate matrix equation; (3) the commuted
matrix equation and (4) the conjugate commuted matrix equation.

Before proceeding further with the study of linear matrix equations
it will be necessary to prove the following theorems.

Theorem  II.  The  commutant  of  the  product  of  the  conjugants  of 
two  double  matrices  is  equal  to  the  conjugant  of  the  product  of  the 
commutants  taken  in  the  reverse  order.  That  is 


Let 

=  mi 

ij  —  2J  ^ikbkf 
k-l 

where 

iii  =  (t)« 

-  O,.  -  Wh 

therefore 

K  -  mu 

-  bii  =  («),7 

Ci)  “  23 

t-1 


C,7  =  2Z  H  hkQki  • 

*-l  k-l 


and 




where the last step was obtained from the fact that the transpose of
the product of ordinary matrices is the product of the transposes taken
in reverse order.


Now  let 


=  mu  “  Z 

*-i 


then 


da 


kik^ki 

*-  I 


but 

therefore 
but  also 
and 

therefore 


C«i  “  ^ikQn 

*-i 

dij  ■■  Ciy 
Cw  -  (^)iy 
dii  -  (««)y 


in  virtue  of  definition  IV. 

Theorem III. The commutant of the conjugant of a double matrix
is equal to the conjugant of the commutant, that is


(«)  =  (jO  . 


The proof of this theorem is obvious from the definitions of commutant,
conjugant, and expandant, for inspection will show that the expandant
of the commutant of the conjugant is equal to the expandant of the
conjugant of the commutant.
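Theorems II and III lend themselves to a numerical check. The sketch below is not from the paper. It assumes that the commutant transposes every element matrix while leaving it in place, and that the conjugant interchanges the rows and columns of elements without transposing them; the definitions themselves lie outside this section, so this reading is an assumption, and the function names are hypothetical. Both identities hold equally if the two names are interchanged.

```python
import random


def mat_mul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_add(a, b):
    """Ordinary matrix sum."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def block_mul(A, B):
    """Product of double matrices: element (i, j) is sum_k a_ik b_kj."""
    n = len(A[0][0])
    return [[_block_entry(A, B, i, j, n) for j in range(len(B[0]))]
            for i in range(len(A))]


def _block_entry(A, B, i, j, n):
    acc = [[0] * n for _ in range(n)]
    for k in range(len(B)):
        acc = mat_add(acc, mat_mul(A[i][k], B[k][j]))
    return acc


def conjugant(A):
    """Assumed meaning: element (i, j) of the result is a_ji."""
    m = len(A)
    return [[A[j][i] for j in range(m)] for i in range(m)]


def commutant(A):
    """Assumed meaning: every element matrix is replaced by its transpose."""
    return [[[list(col) for col in zip(*blk)] for blk in row] for row in A]


rng = random.Random(1)
make = lambda: [[[[rng.randint(-4, 4) for _ in range(2)] for _ in range(2)]
                 for _ in range(3)] for _ in range(3)]
A, B = make(), make()

# Theorem II: commutant of (conjugant A)(conjugant B)
#           = conjugant of (commutant B)(commutant A).
thm2_lhs = commutant(block_mul(conjugant(A), conjugant(B)))
thm2_rhs = conjugant(block_mul(commutant(B), commutant(A)))

# Theorem III: the two operations may be applied in either order.
thm3_lhs = commutant(conjugant(A))
thm3_rhs = conjugant(commutant(A))

print(thm2_lhs == thm2_rhs, thm3_lhs == thm3_rhs)
```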

By  the  use  of  Theorems  II  and  III  it  is  now  possible  to  prove  the 
following  theorem  which  is  a  generalization  of  Theorem  II. 

Theorem  IV.  The  commutant  of  the  product  of  the  conjugants  of  a 
sequence  of  double  matrices  is  equal  to  the  conjugant  of  the  product  of 
the  commutants  taken  in  the  reverse  order,  that  is 

jii  » . .  g,  -  ?li. 


It will be proven that if this theorem is true for $n - 1$ double matrices
it is true for $n$ double matrices, and hence, since it is true for $n = 2$, it is
true for all values of $n$. Let


p  »  Si  •  •  •  8,-1 
q  -  a»-i  ...  Si 




then if the theorem is true for $n - 1$


P  »  9 

or  P  =  5 

and  p3,  =  =  flnQ 

where  the  last  step  was  obtained  from  Theorem  II  by  allowing  S  =  9 
and  »  31,;  hence 

pjl,  a*  31,7 

in  virtue  of  Theorem  III.  Finally 

=  31,  31.-,  ...  31, 


which  proves  the  theorem. 
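The extension to a sequence of double matrices can be checked in the same spirit. This sketch is illustrative only; as before, the readings of commutant (transpose each element) and conjugant (interchange rows and columns of elements) are assumptions, and the helper names are hypothetical. Four random double matrices are multiplied with `functools.reduce`.

```python
import random
from functools import reduce


def mat_mul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def block_mul(A, B):
    """Product of double matrices: element (i, j) is sum_k a_ik b_kj."""
    n = len(A[0][0])
    C = []
    for i in range(len(A)):
        row = []
        for j in range(len(B[0])):
            acc = [[0] * n for _ in range(n)]
            for k in range(len(B)):
                prod = mat_mul(A[i][k], B[k][j])
                acc = [[p + q for p, q in zip(rp, rq)]
                       for rp, rq in zip(acc, prod)]
            row.append(acc)
        C.append(row)
    return C


def conjugant(A):
    """Assumed meaning: element (i, j) of the result is a_ji."""
    m = len(A)
    return [[A[j][i] for j in range(m)] for i in range(m)]


def commutant(A):
    """Assumed meaning: every element matrix is replaced by its transpose."""
    return [[[list(col) for col in zip(*blk)] for blk in row] for row in A]


rng = random.Random(2)
seq = [[[[[rng.randint(-2, 2) for _ in range(2)] for _ in range(2)]
         for _ in range(2)] for _ in range(2)] for _ in range(4)]

# Commutant of the product of the conjugants, taken in the given order ...
lhs = commutant(reduce(block_mul, [conjugant(M) for M in seq]))
# ... against the conjugant of the product of the commutants, reversed.
rhs = conjugant(reduce(block_mul, [commutant(M) for M in reversed(seq)]))
print(lhs == rhs)
```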

Return now to the transformations defined by equations (1), (2), (3)
and (4) and introduce the following double matrices of the $m$th order.


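$$X = \begin{Vmatrix} x_{11} & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ x_{m1} & 0 & \cdots & 0 \end{Vmatrix} = \begin{Vmatrix} x_{11} \\ \vdots \\ x_{m1} \end{Vmatrix}; \qquad Y = \begin{Vmatrix} y_{11} & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ y_{m1} & 0 & \cdots & 0 \end{Vmatrix} = \begin{Vmatrix} y_{11} \\ \vdots \\ y_{m1} \end{Vmatrix}$$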

then  in  the  usual  manner 


Xll  • 

.  0 

Xu 

Xn  • 

•  •  ^m\ 

..  0 

Xml 

; 

0  .. 

.  0 

^  »  II  Xii  . .  •  x«i  II 


r  - 

yn  ■ 

..  0 

I/ll 

;  Y  = 

l/n  •  •  •  Vml 

ymi  • 

..  0 

ymi 

— 

II 

0  ...  0 

ymi  II 

=  lUi,  • . .  x«i  II  ; 


=  II  yii  •  •  •  Vmi  II ; 




If this is done equation (1) may be written in the form

$$Y = \mathfrak{A}X$$

for $(\mathfrak{A}X)_{ij} = \sum_{k=1}^{m} a_{ik} x_{kj}$. But $x_{kj} = 0$ if $j \neq 1$,
and therefore

$$(\mathfrak{A}X)_{i1} = \sum_{k=1}^{m} a_{ik} x_{k1},$$

but $(\mathfrak{A}X)_{i1} = y_{i1} = y_i$, therefore

$$y_i = \sum_{k=1}^{m} a_{ik} x_k$$

which is precisely equation (1). This process may also be applied
to the remaining equations with the result that


(12a) 

:  Y  =  ax 

(12b) 

m 

y.-  Z 

OjiXj  \ 

r  -  ax 

(12  c) 

m 

yi »  Z 

t-i 

Xflii 

:  f  -  Xa 

(12d) 

y.  -  Z 

*-i 

Xflii  : 

M 

It  is  to  be  noted  that  (12  c)  and  (12  d)  may  be  written  in  alternate  form 
corresponding  to  equations  (10)  and  (11).  Consider  (12  c) 

f  =  :^a 

or 

f«r  =  M^.ax 

and 

(12'c)  -  £  «•»?»  ;  Y  - 

which  corresponds  to  (10).  Consider  now  (12  d) 

f  -  xa 




or 

and 

(12'd)  Vi  -  E  :  r  -  lA’ 

which  corresponds  to  (11)  if  the  transpose  of  each  side  of  (11)  is  taken. 
It  is  obvious  why  the  association  made  earlier  was  used. 

By  allowing  Oi/  «  bn  and  introducing  the  double  matrix 


equation  (12  b)  can  be  written  in  the  form 
(12"  b)  Vi  -  E  biiXi  :  K  » 

and  since  Qa  =  bn  =  S  equation  (12'd)  can  also  be  written  in 

the  form 

(12"  d)  Vi-t,  hafi  :  r  - 

;-l 

hence (12a) and (12''b) are of the same form, and also (12'c) and (12''d)
are of the same form; therefore it is only necessary to discuss equations of
form (12a) and (12'c). To distinguish between these two forms the
following definition will be made.

Definition  VIII.  A  system  of  linear  matrix  equations  of  form 

$$y_i = \sum_{j=1}^{m} a_{ij} x_j \; : \; Y = \mathfrak{A}X$$

will  be  said  to  be  a  system  of  prefactor  matrix  equations;  while  a  system 
of  form 

=  E  :  I"  -  Xa 

/-I 

will  be  said  to  be  a  system  of  postfactor  matrix  equations. 




It  is  to  be  noted  that  a  linear  matrix  equation  of  form 

yt  •  QiiXi  :  r  - 

is  also  a  postfactor  matrix  equation. 

A system of linear matrix equations may be considered as a linear
transformation carrying one set of matrices into another. Consider
first a set of prefactor matrix equations

$$x_i' = \sum_{j=1}^{m} a_{ij} x_j \; : \; X' = \mathfrak{A}X;$$

this may be considered a linear transformation carrying the variables
$(x_1, \ldots, x_m)$ into a new set of variables $(x_1', \ldots, x_m')$ or, in virtue of
the second relation, which is equivalent, $\mathfrak{A}$ may be considered as a
transformation changing the variable from $X$ to $X'$ or, since $X$ and $X'$
are special forms of a double matrix, changing the double matrix $X$ to $X'$.
Now consider a set of postfactor matrix equations

$$x_i' = \sum_{j=1}^{m} x_j a_{ji} \; : \; \tilde{X}' = \tilde{X}\mathfrak{A};$$

this may be considered as a linear transformation carrying the variables
$(x_1, \ldots, x_m)$ into a new set of variables $(x_1', \ldots, x_m')$ or, because of
the second relation, which is equivalent, $\mathfrak{A}$ may be considered as a
transformation changing the variable from $\tilde{X}$ to $\tilde{X}'$ or, since $\tilde{X}$ and $\tilde{X}'$ are
special forms of a double matrix, changing the double matrix $\tilde{X}$ to $\tilde{X}'$.
Because $X$ and $X'$ are vertical double matrices of one column and $\tilde{X}'$
and $\tilde{X}$ are horizontal matrices of one row, it is obvious that $X' = \mathfrak{A}X$
and $\tilde{X}' = \tilde{X}\mathfrak{A}$ are essentially different types of transformations, for the
former is a transformation between vertical matrices while the latter is
between horizontal matrices.

Suppose  a  set  of  variable  matrices  X'  are  introduced  as  linear  functions 
of  a  set  of  variable  matrices  X  and  then  a  linear  transformation  is 
introduced  which  changes  the  variables  X'  to  a  new  set  X”.  It  is 
evident  that  by  combining  these  two  transformations  the  resultant 
transformation  will  carry  X  directly  to  X".  In  order  to  completely 
discuss  this  composition  of  double  matrices  the  following  theorem  will  be 
proved. 

Theorem V. If we pass from the variables $x_1 \cdots x_m$ to the variables
$x_1' \cdots x_m'$ by a linear transformation of double matrix $\mathfrak{A}$, and from the
variables $x_1' \cdots x_m'$ to the variables $x_1'' \cdots x_m''$ by another linear
transformation of the same form as the first and of matrix $\mathfrak{B}$, then the linear
transformation of double matrix $\mathfrak{B}\mathfrak{A}$ will

1. Carry us directly from the variables $x_1 \cdots x_m$ to the variables
$x_1'' \cdots x_m''$.

2. Be of the same form as $\mathfrak{A}$ or $\mathfrak{B}$.

It is to be noticed that this theorem only applies to transformations
of the same form, that is, if the transformation carrying us from $x_i$ to $x_i'$ is
a prefactor (or postfactor) transformation, then in order that this
theorem apply the transformation carrying us from $x_i'$ to $x_i''$ must also
be a prefactor (or postfactor) matrix transformation. The proof of
this theorem will now be given for each case.
a)  Prefactor  Transformation: 

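Let the linear transformation carrying $x_1 \cdots x_m$ to $x_1' \cdots x_m'$ and of
matrix $\mathfrak{A}$ be

$$x_j' = \sum_{i=1}^{m} a_{ji} x_i \; : \; X' = \mathfrak{A}X$$

and the linear transformation carrying $x_1' \cdots x_m'$ to $x_1'' \cdots x_m''$ and of
matrix $\mathfrak{B}$ be

$$x_k'' = \sum_{j=1}^{m} b_{kj} x_j' \; : \; X'' = \mathfrak{B}X'.$$

Now if the value of $x_j'$ is replaced in the second transformation by the value
given by the first transformation then

$$x_k'' = \sum_{j=1}^{m} \sum_{i=1}^{m} b_{kj} a_{ji} x_i = \sum_{i=1}^{m} \sum_{j=1}^{m} b_{kj} a_{ji} x_i.$$

Now let $\mathfrak{C}$ be the double matrix of the transformation carrying $x_1 \cdots x_m$
directly into $x_1'' \cdots x_m''$; then

$$(\mathfrak{C})_{ki} = \sum_{j=1}^{m} b_{kj} a_{ji} = c_{ki}$$

but we had previously that

$$(\mathfrak{B}\mathfrak{A})_{ki} = \sum_{j=1}^{m} b_{kj} a_{ji}$$

from Definition VII, therefore

$$(\mathfrak{C})_{ki} = (\mathfrak{B}\mathfrak{A})_{ki}$$

and from Definition IV it follows that

$$\mathfrak{C} = \mathfrak{B}\mathfrak{A};$$

therefore

$$x_k'' = \sum_{i=1}^{m} c_{ki} x_i \; : \; X'' = \mathfrak{C}X$$

where

$$c_{ki} = \sum_{j=1}^{m} b_{kj} a_{ji} \; : \; \mathfrak{C} = \mathfrak{B}\mathfrak{A}$$

which proves the theorem in the case of the prefactor transformation.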
b)  Postfactor  Transformation. 

Let the postfactor transformation carrying $x_1 \cdots x_m$ to $x_1' \cdots x_m'$ and of
matrix $\mathfrak{A}$ be


-  E  XiOii  :  X'  ~  Xfl 

and the postfactor transformation carrying $x_1' \cdots x_m'$ to $x_1'' \cdots x_m''$ and of
matrix $\mathfrak{B}$ be


“  Z  bki  :  X"  -  X' « 

now if the value of $x_j'$ is replaced in the second transformation by its value
given in the first transformation then


,  /-I  i-i 

and  taking  the  transpose  of  each  side 


?*  ”  53  ^3 

;-l  1-1 

and  interchanging  the  summation  signs 


?*  “  ^3  ^3 

<-i  1-1 


It is important to note at this point that the matrix associated with the
postfactor matrix equation is made from equation (10) or (12'c) and not
from equation (3) or (12c). Now since the transposed elements of $\mathfrak{A}$ and
$\mathfrak{B}$ are ordinary matrices of the $n$th order, the last equation can be
written, on replacing

m 

2  “  C« 

i-i 

as 

m 

i-l 

which  is  of  same  form  as  previous  equations.  Now  let 

Cki  =  (§)*» 

that  is  $-  is  matrix  associated  with  last  equation  hut 

C«  -  2 

in  virtue  of  Definitions  VI  and  VII,  therefore 

(«)*.  -  (»?)*.• 

and  from  Definition  IV  it  follows  that 

and  2^''  =  or  finally 

which proves the theorem for the postfactor transformation. Of course it is
obvious that

j?"  =  xm 

or 

X"  -  X^ 

but it is again important to notice that this form does not define the type
of matrix associated with a postfactor transformation.
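The composition rule of Theorem V can be illustrated numerically. The sketch below is not from the paper. It assumes the prefactor rule $x_i' = \sum_j a_{ij} x_j$ and, for the postfactor case, the one-row rule $x_i' = \sum_j x_j a_{ji}$; the function names `prefactor` and `postfactor` are hypothetical. In the one-row form the composite appears with the factors in the order $\mathfrak{A}\mathfrak{B}$, whereas the double matrix the paper associates with a postfactor equation is built from the transposed form, so the row convention here is only one way of writing it.

```python
import random


def mat_mul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_sum(mats):
    """Sum of a non-empty list of equally sized ordinary matrices."""
    total = mats[0]
    for mtx in mats[1:]:
        total = [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(total, mtx)]
    return total


def block_mul(A, B):
    """Product of double matrices: element (i, j) is sum_k a_ik b_kj."""
    m = len(A)
    return [[mat_sum([mat_mul(A[i][k], B[k][j]) for k in range(m)])
             for j in range(m)] for i in range(m)]


def prefactor(M, xs):
    """x'_i = sum_j m_ij x_j, that is X' = M X for the one-column form."""
    return [mat_sum([mat_mul(M[i][j], xs[j]) for j in range(len(xs))])
            for i in range(len(M))]


def postfactor(M, xs):
    """x'_i = sum_j x_j m_ji, the one-row form multiplied on the right by M."""
    return [mat_sum([mat_mul(xs[j], M[j][i]) for j in range(len(xs))])
            for i in range(len(M))]


rng = random.Random(3)
rand_block = lambda: [[rng.randint(-3, 3) for _ in range(2)] for _ in range(2)]
A = [[rand_block() for _ in range(3)] for _ in range(3)]
B = [[rand_block() for _ in range(3)] for _ in range(3)]
xs = [rand_block() for _ in range(3)]

# Two prefactor steps agree with the single prefactor step of matrix BA.
pre_two_steps = prefactor(B, prefactor(A, xs))
pre_one_step = prefactor(block_mul(B, A), xs)

# Two postfactor steps agree with the single row-form step of matrix AB.
post_two_steps = postfactor(B, postfactor(A, xs))
post_one_step = postfactor(block_mul(A, B), xs)

print(pre_two_steps == pre_one_step, post_two_steps == post_one_step)
```

A prefactor step followed by a postfactor step has no single double matrix in this scheme, which is why the theorem is restricted to transformations of the same form.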

It follows from Theorem V that the formal symbolism of group
theory may be applied to matrix transformations under the additional
restriction that the transformations be of the same kind. For if

$$X' = \mathfrak{A}X; \qquad X'' = \mathfrak{B}X'$$

then

$$X'' = \mathfrak{B}(\mathfrak{A}X) = (\mathfrak{B}\mathfrak{A})X$$



and  if 

X'  -  m  or  X'  -  X"  -  or  -  SX' 

then 

X"  -  (X«)®  -  X(«!S) 

or 

X"  -  ®(?X)  =  («a)X 

but if

$$X' = \mathfrak{A}X \quad \text{and} \quad X'' = X'\mathfrak{B}$$

it does not follow that

$$X'' = \mathfrak{A}X\mathfrak{B}$$

for as yet $\mathfrak{A}X\mathfrak{B}$ is undefined. This is the reason we are limited to
transformations of the same kind. It is important to observe at this point
that if an identity double matrix and an inverse double matrix can be
defined it will be possible to prove the theorem that all double matrices
form a group. The identity double matrix will be defined in the following
manner.

Definition IX. The special linear transformation

$$x_i' = x_i \qquad (i = 1, \ldots, m)$$

whose double matrix is $\mathfrak{I}$, where

$$(\mathfrak{I})_{ij} = I_{ij}$$

and $I_{ij}$ is an ordinary matrix of $n$ rows and columns which is equal to
the ordinary identity matrix when $i = j$ and is equal to the ordinary
zero matrix when $i \neq j$, is called the identity transformation, and $\mathfrak{I}$ is
called the identity double matrix.

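Theorem VI. If $\mathfrak{A}$ is any double matrix and $\mathfrak{I}$ is the identity double
matrix then

$$\mathfrak{A}\mathfrak{I} = \mathfrak{I}\mathfrak{A} = \mathfrak{A}.$$

For by Definition VII

$$(\mathfrak{A}\mathfrak{I})_{ik} = \sum_{j=1}^{m} a_{ij} I_{jk} = a_{ik} = (\mathfrak{A})_{ik}$$

since $I_{jk} = I\delta_{jk}$, where $\delta_{jk}$ is the Kronecker delta and $I$ is the ordinary
identity matrix. Also

$$(\mathfrak{I}\mathfrak{A})_{ik} = \sum_{j=1}^{m} I_{ij} a_{jk} = a_{ik} = (\mathfrak{A})_{ik}$$

therefore

$$\mathfrak{A}\mathfrak{I} = \mathfrak{I}\mathfrak{A} = \mathfrak{A}$$

in virtue of Definition IV, which proves the theorem.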
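As a small illustration of Definition IX and Theorem VI, the following sketch (not from the paper; the nested-list representation and the names `identity_double` and `block_mul` are assumptions) builds the identity double matrix and multiplies a sample double matrix by it on either side.

```python
def mat_mul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def block_mul(A, B):
    """Product of double matrices: element (i, j) is sum_k a_ik b_kj."""
    n = len(A[0][0])
    C = []
    for i in range(len(A)):
        row = []
        for j in range(len(B[0])):
            acc = [[0] * n for _ in range(n)]
            for k in range(len(B)):
                prod = mat_mul(A[i][k], B[k][j])
                acc = [[p + q for p, q in zip(rp, rq)]
                       for rp, rq in zip(acc, prod)]
            row.append(acc)
        C.append(row)
    return C


def identity_double(m, n):
    """Element (i, j) is the n x n identity when i == j, else the zero matrix."""
    return [[[[1 if (i == j and r == c) else 0 for c in range(n)]
              for r in range(n)] for j in range(m)] for i in range(m)]


A = [[[[1, 2], [3, 4]], [[0, 1], [1, 0]]],
     [[[2, 0], [0, 2]], [[1, 1], [0, 1]]]]
J = identity_double(2, 2)

right_product = block_mul(A, J)
left_product = block_mul(J, A)
print(right_product == A, left_product == A)
```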

Before  defining  the  inverse  to  a  double  matrix  it  will  be  necessary  to 
define  what  is  meant  by  the  product  of  a  double  matrix  and  a  scalar. 
Scalars  will  be  denoted  by  lower  case  Greek  letters.  The  definition 
follows. 

Definition X. The product of a scalar, $\alpha$, and a double matrix, $\mathfrak{A}$, has
the following property:

$$(\alpha\mathfrak{A})_{ij} = (\mathfrak{A}\alpha)_{ij} = \alpha(\mathfrak{A})_{ij}$$

This definition implies that the multiplication of a double matrix by a
scalar is commutative, and that each element of the double matrix is
multiplied by the scalar.

Earlier in this work the determinant of a double matrix was
defined as the determinant of the expandant of the double matrix
(Definitions II and III). The following theorem follows immediately
from this definition.

Theorem  VII.  The  determinant  of  the  commuted  conjugate  of  a 
double  matrix  is  equal  to  the  determinant  of  the  double  matrix  and  the 
determinant  of  the  commutant  is  equal  to  the  determinant  of  the 
conjugant. 

This  follows  immediately  for  by  definition 


a 


1 1 
1 1 


ali 


In. 

“l  1 


a 


Im 

1  a 


a 


1  1 
a  1 


,Iat 

*al 


a 


lai 
a  a 


a:(9)  = 


o"!  •••«"! . oTT  •  •  •  «7r 

<*:}•••  a:i . <*:?  •  •  • 




(^fi) 


1 1 

1 

1 1  * 

'•  O.,  ... 

...a,, 

1 1 

1  n  * 

•oii  ... 

•••a7i 

1 1  • 

•  a.,  . . . 

...a,, 

air 


®  I  »  *  “  *  ^mm 


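and inspection will show that the expandant of the commuted conjugate of
$\mathfrak{A}$ is the ordinary transpose of $E(\mathfrak{A})$; hence they have the same
determinant, because in an ordinary matrix the rows and columns can be
interchanged without changing the value of the determinant. Hence the
determinant of the commuted conjugate of $\mathfrak{A}$ is equal to
$\Delta(\mathfrak{A}) = \det E(\mathfrak{A})$. To prove the last part of the theorem, let $\mathfrak{B}$ be
the conjugant of $\mathfrak{A}$; then the commutant of $\mathfrak{A}$ is the commuted conjugate
of $\mathfrak{B}$. But the determinant of the commuted conjugate of $\mathfrak{B}$ is equal to
$\Delta(\mathfrak{B})$; therefore the determinant of the commutant of $\mathfrak{A}$ is equal to the
determinant of the conjugant of $\mathfrak{A}$.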

The  following  definition  will  now  be  made. 

Definition XI. A double matrix of determinant zero is said to be
singular. The transformation defining such a double matrix is said to
be singular.

For the remainder of this essay it will be assumed that all transformations
are non-singular, that is, they have determinant unequal to zero.

Before defining the inverse of a double matrix the following definition
will be given.

Definition XII. The ordinary matrix


^  1 1 


A\i  ‘A 


i,j  ^  1,  ...  ,  m 


where  A  ][  i  is  the  ordinary  oofactor  of  al  {  in  the  expandant  of  9  is  said 
to  be  the  double  cofactor  of  in  while  the  double  matrix  9  where 


(®)»/  " 


is  the  double  adjoint  of  the  double  matrix  91. 




Theorem  VIII.  If  9  is  a  double  matrix  and  9  is  its  adjoint  then 


For 


where 


Now  let 


therefore 


aa  »  - 

3A(a) 

.. 

ail  •  * 

»  Aji  »  Jy-  =“ 

T*'  ... 

Ait  -  iti 

(?l9)a  »  mu, 

»  1  mt:  1 

m 

m,*  » 

Oy  jy* 

and  since 


o./  -  I  aii  1 ; 


Ay*  «  I  Jit 


then 


mi*  «  I  aii  II  Jitl 

and by definition of addition and equality of ordinary matrices it follows
that

"i:  -  f:  i: 

i-i  .-1 

but 

—  4*^ 

r|i  *  ^  ji  »  *  ^  a  9 

therefore 

i-»  »-i 

now  consider  the  determinant  of  It  is 




A(?l)  -  det  E{%) 


a}}  ...ali 


aii 


aTl 


<1  •••  ari 


a}7...al: 


alT  •••  oir 


Oil  •••  a,, 


*  ®ii 


Now  since  X  J  J  is  ordinary  cofactor  of  ai  j  in  £(8)  we  have  from  the  well 
known  theorem  on  the  development  of  the  determinant  in  terms  of 
cofactor  that 

i-i  <-i 

therefore 

mx*  “  A(3l)  6ik 

But  mx*  is  element  in  X  row  and  ti  column  of  ordinary  matrix  ma, 
hence,  mu,  is  diagonal  matrix  having  ^(9)  along  the  diagonal  if  i  >>  1; 
and  zero  along  the  diagonal  if  i  ^  k.  Therefore 


m,*  » 

and  hence 

(»a)a  -  A(a)/a 


therefore  from  Definitions  IV,  IX  and  X  it  follows 


«a  -  A(?i)a 


In  a  similar  manner  it  can  be  shown  that 


a«  -  A(a)3 

and  therefore 


a?i  -  ?ia  -  a(«)3 

which  proves  the  theorem. 
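The content of Theorem VIII can be illustrated numerically. The sketch below is not from the paper, and it does not follow the index conventions of Definition XII, which are not recoverable here. It assumes instead that the double adjoint may be obtained by forming the classical adjugate (transposed cofactor matrix) of the expandant and cutting it back into elements of order $n$; the names `adjugate` and `partition` are hypothetical.

```python
def mat_mul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def expand(A):
    """Assumed expandant: the ordinary (m*n) x (m*n) matrix of all entries."""
    m, n = len(A), len(A[0][0])
    return [[A[i][j][r][c] for j in range(m) for c in range(n)]
            for i in range(m) for r in range(n)]


def partition(M, m, n):
    """Cut an (m*n) x (m*n) ordinary matrix into an m x m double matrix."""
    return [[[[M[i * n + r][j * n + c] for c in range(n)] for r in range(n)]
             for j in range(m)] for i in range(m)]


def det(M):
    """Determinant by expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** c * M[0][c] * det([row[:c] + row[c + 1:] for row in M[1:]])
               for c in range(len(M)))


def adjugate(M):
    """Transpose of the matrix of ordinary cofactors of M."""
    size = len(M)
    cof = [[(-1) ** (r + c) * det([row[:c] + row[c + 1:]
                                   for k, row in enumerate(M) if k != r])
            for c in range(size)] for r in range(size)]
    return [list(col) for col in zip(*cof)]


A = [[[[1, 2], [3, 4]], [[0, 1], [1, 0]]],
     [[[2, 0], [0, 2]], [[1, 1], [0, 1]]]]
m, n = 2, 2

E = expand(A)
delta = det(E)
adjoint = partition(adjugate(E), m, n)  # assumed form of the double adjoint

# Multiplying expandants and re-partitioning is the same as multiplying
# the double matrices element by element.
product_right = partition(mat_mul(E, expand(adjoint)), m, n)
product_left = partition(mat_mul(expand(adjoint), E), m, n)
scaled_identity = [[[[delta if (i == j and r == c) else 0 for c in range(n)]
                     for r in range(n)] for j in range(m)] for i in range(m)]

print(product_right == scaled_identity, product_left == scaled_identity)
```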

This  theorem  admits  of  the  corollary. 



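Corollary. If $A_{ij}$ is the double cofactor of $a_{ij}$ in the double matrix $\mathfrak{A}$, then

$$\sum_{j=1}^{m} a_{ij} A_{kj} = \Delta(\mathfrak{A})\, I_{ik}$$

$$\sum_{j=1}^{m} A_{ji} a_{jk} = \Delta(\mathfrak{A})\, I_{ik}$$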


For 


and 


=  A(?l)  In  ^  ^  =»  2  Afiafk 


m)n  -  AW  /a  -  E  -  L 

j-i  j-i 

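Theorem IX. All non-singular double matrices have inverses. If $\mathfrak{A}$
is a non-singular double matrix, then the inverse of $\mathfrak{A}$ is $\mathfrak{A}^{-1}$, where
$\mathfrak{A}^{-1}$ is the double adjoint of $\mathfrak{A}$ multiplied by the scalar
$1/\Delta(\mathfrak{A})$, and

$$\mathfrak{A}\mathfrak{A}^{-1} = \mathfrak{A}^{-1}\mathfrak{A} = \mathfrak{I}$$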


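This theorem is an obvious consequence of Theorem VIII, for if the
double matrix were singular the quotient of the double adjoint by
$\Delta(\mathfrak{A})$, which defines $\mathfrak{A}^{-1}$, would not exist; hence the condition is
necessary. If $\Delta(\mathfrak{A}) \neq 0$, then $\mathfrak{A}\mathfrak{A}^{-1}$ is the product of $\mathfrak{A}$ and its
double adjoint divided by $\Delta(\mathfrak{A})$; but by Theorem VIII that product is
$\Delta(\mathfrak{A})\mathfrak{I}$, therefore $\mathfrak{A}\mathfrak{A}^{-1} = \mathfrak{I}$. Also, in the same way,
$\mathfrak{A}^{-1}\mathfrak{A} = \mathfrak{I}$; therefore

$$\mathfrak{A}\mathfrak{A}^{-1} = \mathfrak{A}^{-1}\mathfrak{A} = \mathfrak{I}$$

which proves the theorem.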




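Theorem X. All non-singular double matrices of the $m$th order
having as elements ordinary matrices of the $n$th order form a group.
For,

1. $\mathfrak{A}\mathfrak{B} = \mathfrak{C}$    Definition VII

2. $(\mathfrak{A}\mathfrak{B})\mathfrak{C} = \mathfrak{A}(\mathfrak{B}\mathfrak{C})$    Theorem I

3. $\mathfrak{A}\mathfrak{I} = \mathfrak{I}\mathfrak{A} = \mathfrak{A}$    Theorem VI

4. $\mathfrak{A}\mathfrak{A}^{-1} = \mathfrak{A}^{-1}\mathfrak{A} = \mathfrak{I}$    Theorem IX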

It has been shown previously that all non-singular double matrices
have an inverse. We will now show that all linear transformations of
non-vanishing determinant can be inverted, that is, equations (1), (2),
(3) and (4) can be changed in such a manner that $x_j$ becomes a linear
function of the $y_i$'s.

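Consider first the equation

$$y_i = \sum_{j=1}^{m} a_{ij} x_j \qquad (i = 1, \ldots, m)$$

having as its associated double matrix $\mathfrak{A}$. Multiply this equation on the
left by $A_{ik}$ and get

$$A_{ik} y_i = \sum_{j=1}^{m} A_{ik} a_{ij} x_j \qquad (i, k = 1, \ldots, m)$$

Now take the sum with respect to $i$ and get

$$\sum_{i=1}^{m} A_{ik} y_i = \sum_{i=1}^{m} \sum_{j=1}^{m} A_{ik} a_{ij} x_j \qquad (k = 1, \ldots, m)$$

and interchanging summation signs this equation may be written

$$\sum_{i=1}^{m} A_{ik} y_i = \sum_{j=1}^{m} \left( \sum_{i=1}^{m} A_{ik} a_{ij} \right) x_j \qquad (k = 1, \ldots, m)$$

but from the corollary to Theorem VIII

$$\sum_{i=1}^{m} A_{ik} a_{ij} = \Delta(\mathfrak{A}) I_{kj}$$

therefore

$$\sum_{i=1}^{m} A_{ik} y_i = \sum_{j=1}^{m} \Delta(\mathfrak{A}) I_{kj} x_j \qquad (k = 1, \ldots, m)$$

where $I_{kj}$ is the ordinary identity matrix if $k = j$ and the ordinary zero
matrix if $k \neq j$, therefore

$$\Delta(\mathfrak{A})\, x_k = \sum_{i=1}^{m} A_{ik} y_i$$

or

$$x_k = \frac{1}{\Delta(\mathfrak{A})} \sum_{i=1}^{m} A_{ik} y_i$$

which system has for its associated double matrix the double adjoint of
$\mathfrak{A}$ multiplied by $1/\Delta(\mathfrak{A})$, that is, $\mathfrak{A}^{-1}$, since $A_{ik}$ is the cofactor
of $a_{ik}$ in $\mathfrak{A}$. Hence we may write that

$$X = \mathfrak{A}^{-1} Y$$

but the original equation was of the form

$$Y = \mathfrak{A}X$$

and multiplying on the left by $\mathfrak{A}^{-1}$ we get

$$\mathfrak{A}^{-1} Y = \mathfrak{A}^{-1}(\mathfrak{A}X) = (\mathfrak{A}^{-1}\mathfrak{A})X = \mathfrak{I}X = X$$

which is exactly the same as the equation previously obtained. We are
therefore justified in using the double matrix notation in connection with
equations of form (1). By using methods exactly the same as the
above it can be shown that the notation is completely applicable also to the
other forms of the linear matrix equation and will yield solutions
completely analogous to the above.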
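The inversion of a prefactor system can be carried through on a small example. The sketch below is illustrative and not from the paper. It assumes that the inverse double matrix may be computed as the adjugate of the expandant divided by its determinant and cut back into elements; exact rational arithmetic from the standard `fractions` module avoids rounding. All function names are hypothetical.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def expand(A):
    """Assumed expandant: the ordinary (m*n) x (m*n) matrix of all entries."""
    m, n = len(A), len(A[0][0])
    return [[A[i][j][r][c] for j in range(m) for c in range(n)]
            for i in range(m) for r in range(n)]


def partition(M, m, n):
    """Cut an (m*n) x (m*n) ordinary matrix into an m x m double matrix."""
    return [[[[M[i * n + r][j * n + c] for c in range(n)] for r in range(n)]
             for j in range(m)] for i in range(m)]


def det(M):
    """Determinant by expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** c * M[0][c] * det([row[:c] + row[c + 1:] for row in M[1:]])
               for c in range(len(M)))


def adjugate(M):
    """Transpose of the matrix of ordinary cofactors of M."""
    size = len(M)
    cof = [[(-1) ** (r + c) * det([row[:c] + row[c + 1:]
                                   for k, row in enumerate(M) if k != r])
            for c in range(size)] for r in range(size)]
    return [list(col) for col in zip(*cof)]


def inverse_double(A):
    """Adjugate of the expandant over its determinant, as a double matrix."""
    E = expand(A)
    delta = det(E)
    if delta == 0:
        raise ValueError("singular double matrix has no inverse")
    scaled = [[Fraction(v, delta) for v in row] for row in adjugate(E)]
    return partition(scaled, len(A), len(A[0][0]))


def prefactor(M, xs):
    """y_i = sum_j m_ij x_j for a list xs of ordinary matrices."""
    n = len(xs[0])
    out = []
    for i in range(len(M)):
        acc = [[0] * n for _ in range(n)]
        for j in range(len(xs)):
            prod = mat_mul(M[i][j], xs[j])
            acc = [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(acc, prod)]
        out.append(acc)
    return out


A = [[[[1, 2], [3, 4]], [[0, 1], [1, 0]]],
     [[[2, 0], [0, 2]], [[1, 1], [0, 1]]]]
xs = [[[1, 0], [2, 1]], [[0, 3], [1, 1]]]  # the unknowns, chosen in advance

ys = prefactor(A, xs)                          # Y = A X
recovered = prefactor(inverse_double(A), ys)   # X = A^{-1} Y
print(recovered == xs)
```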

In this essay an attempt has been made to develop the theory of linear
matrix equations in a manner closely paralleling that which was used
in the case of ordinary linear equations. To do this the concept of
double matrices was introduced. It may be remarked at this point
that the theory of matrices whose elements are matrices could have been
completely developed without the introduction of linear matrix equations,
but the author felt that such a development would have been
unsatisfactory because of the seemingly formal character of such a
work. It would have been possible to completely solve the linear
matrix equations without the introduction of double matrices, but such a
development would not have shown the interconnections between the
various forms of the equation. By keeping both types of development
in mind it has been possible to develop a theory which is completely
analogous to the matrix theory of ordinary linear equations.

There  still  exists  one  form  of  linear  matrix  equation  which  has  yet 
to  be  solved.  The  general  form  of  such  an  equation  is 

!/•  -  23  (t  =  1,  •  •  •  ,  m) 

1-1 

At  the  moment  it  is  not  known  whether  this  equation  will  yield  to  the 
methods  of  double  matrices  but  an  attempt  will  be  made  by  the  author, 
in  the  near  future,  to  find  its  solution. 




STRESS FUNCTIONS
By H. B. Phillips

1. Introduction. In two-dimensional applications of the theory of
elasticity, the components of stress are frequently expressed in terms of
partial derivatives of a stress function, the precise expressions being so
chosen that the conditions of statical equilibrium become a consequence
of partial derivatives being independent of the order of differentiation.
The compatibility condition satisfied by components of stress then
becomes a partial differential equation for the determination of the
stress function and so of the stresses. A function so intimately related
to stress should have mechanical significance and it is the object of this
paper to investigate this significance, at least in the most common cases.

2. The Airy Stress Function. Consider a solid free from body
forces in which the stresses are functions of two rectangular coordinates
$x$ and $y$ only. To simplify the description suppose $x$, $y$, $z$ a right-handed
system of axes with $z$-axis drawn upward. Since the stresses do not
depend on $z$ it is sufficient to describe them at points of the $xy$-plane.
Let $P_0$ be a fixed point and $P$ a variable point of this plane. Join the two
by a curve $P_0P$ and upon this as base construct a vertical cylindrical
surface $S$ with generators of unit length. Take as positive the right
side of this surface as seen by an observer on the upper side of the
$xy$-plane moving from $P_0$ toward $P$. Let the tractions across $S$ exerted
by material on the positive upon material on the negative side have a
resultant of components $R_x$, $R_y$ parallel to the $x$ and $y$ axes, and a
moment $\phi$ about an axis directed vertically upward through $P$. This
moment $\phi$ considered as a function of the coordinates $x$, $y$ of $P$ is the Airy
stress function.

To show this we note first that the quantities $R_x$, $R_y$, and $\phi$ are
independent of the path from $P_0$ to $P$. For if two different paths are taken
the corresponding surfaces $S$, $S'$ and two horizontal planes bound a
region in equilibrium. By symmetry the tractions on the upper and
lower faces have opposite resultants and opposite moments. The
same must be true of the tractions on $S$ and $S'$. But these tractions
are on the positive side of one surface and the negative side of the other.
Hence the tractions on the positive sides of the two surfaces have
equal resultants and equal moments. Thus the quantities $R_x$, $R_y$, and $\phi$
are functions of $x$ and $y$ only.




Assuming a case where force components and moments are all positive,
when $x$ is changed to $x + dx$ and $y$ to $y + dy$ the lever arms of the
forces $R_x$ are increased the amount $dy$ and those of the forces $R_y$
decreased the amount $dx$. Hence

$$\frac{\partial \phi}{\partial y} = R_x, \qquad \frac{\partial \phi}{\partial x} = -R_y. \qquad (1)$$


Because of the convention of signs these formulas, thus obtained for
positive values, are correct for all values. Differentiating the second
equation with respect to $x$, we get

$$\frac{\partial^2 \phi}{\partial x^2} = -\frac{\partial R_y}{\partial x} = \sigma_y, \qquad (2)$$


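the change of sign being due to the fact that a negative $y$-component of
force corresponds to tension and so positive $\sigma_y$. In a similar way we find

$$\frac{\partial^2 \phi}{\partial y^2} = \sigma_x, \qquad \frac{\partial^2 \phi}{\partial x\, \partial y} = -\tau_{xy}. \qquad (3)$$

Equations (2) and (3) are Airy's equations for stress in a plane system
free from body forces.¹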
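The remark in the Introduction, that equilibrium becomes a consequence of the order of differentiation being immaterial, can be illustrated with exact polynomial arithmetic. The sketch below is not from the paper. It assumes the usual Airy relations $\sigma_x = \partial^2\phi/\partial y^2$, $\sigma_y = \partial^2\phi/\partial x^2$, $\tau_{xy} = -\partial^2\phi/\partial x\,\partial y$, and represents a polynomial as a dictionary mapping exponent pairs `(p, q)` to coefficients; the sample $\phi$ and all names are hypothetical.

```python
def diff(poly, var):
    """Differentiate a polynomial {(p, q): c} in x (var=0) or y (var=1)."""
    out = {}
    for (p, q), c in poly.items():
        power = p if var == 0 else q
        if power == 0:
            continue  # the term does not depend on this variable
        key = (p - 1, q) if var == 0 else (p, q - 1)
        out[key] = out.get(key, 0) + c * power
    return out


def add(a, b):
    """Sum of two polynomials, with cancelled terms dropped."""
    out = dict(a)
    for key, c in b.items():
        out[key] = out.get(key, 0) + c
    return {key: c for key, c in out.items() if c != 0}


def scale(a, s):
    """Polynomial multiplied by the number s."""
    return {key: s * c for key, c in a.items()}


# An arbitrary sample stress function: 2 x^3 y - 2 x y^3 + 5 x^2 y^2 + y^4.
phi = {(3, 1): 2, (1, 3): -2, (2, 2): 5, (0, 4): 1}

sigma_x = diff(diff(phi, 1), 1)            # second derivative in y
sigma_y = diff(diff(phi, 0), 0)            # second derivative in x
tau_xy = scale(diff(diff(phi, 0), 1), -1)  # minus the mixed derivative

# Equilibrium without body forces:
#   d(sigma_x)/dx + d(tau_xy)/dy = 0,  d(tau_xy)/dx + d(sigma_y)/dy = 0.
equilibrium_x = add(diff(sigma_x, 0), diff(tau_xy, 1))
equilibrium_y = add(diff(tau_xy, 0), diff(sigma_y, 1))
print(equilibrium_x == {}, equilibrium_y == {})
```

The sample $\phi$ is not chosen to satisfy the compatibility condition; only the equilibrium equations are examined here, and they vanish identically for any $\phi$.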

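At the point $P_0$ the function $\phi$ satisfies the equations

$$\phi = \frac{\partial \phi}{\partial x} = \frac{\partial \phi}{\partial y} = 0. \qquad (4)$$

Conversely any function $\phi(x, y)$ which satisfies these conditions at $P_0$
and has continuous second derivatives is the moment about the vertical
through $P(x, y)$ of tractions determined by (2) and (3) on $S$. For,
taking $x'$, $y'$ as the coordinates of the variable point on $P_0P$, the moment
of those tractions is

$$\int_{P_0}^{P} \left[ (x - x')\, d\!\left(\frac{\partial \phi}{\partial x'}\right) + (y - y')\, d\!\left(\frac{\partial \phi}{\partial y'}\right) \right]$$

and this has for value

$$\left[ (x - x') \frac{\partial \phi}{\partial x'} + (y - y') \frac{\partial \phi}{\partial y'} + \phi(x', y') \right]_{P_0}^{P} = \phi(x, y).$$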


¹ Love, Theory of Elasticity, fourth edition, page 8S.




From the last equation it is evident that a change in the position of
$P_0$ makes a change in $\phi(x, y)$ which is a linear function of $x$ and $y$. The
components of stress, being second derivatives of $\phi$, are thus not changed.
Suppose $P_0$ is taken on the boundary of the cross section. If the boundary
consists of a single connected curve the value of $\phi$ at any other
boundary point $P$ is then equal to the moment about $P$ of external
tractions between $P_0$ and $P$. When the boundary consists of two or
more separate closed curves the stress function is one-valued if and only
if the tractions on each separate curve are in static equilibrium. For
two paths from $P_0$ to $P$ can be made to differ by a particular closed
boundary curve. If the two paths are to determine the same moment
about $P$, the tractions on the boundary curve must have a moment
equal to zero. If this is to be true for every point $P$ those tractions
must be in static equilibrium.

3. Stresses Symmetric About an Axis. Consider a body generated
by rotating a plane area about an axis in its plane, the rotation
being either through a part or a whole revolution. Take the axis as
$z$-axis in a right-handed system of cylindrical coordinates $z$, $r$, $\theta$ and
suppose the stresses are functions of $z$ and $r$ only. In a section $A$ made
by a plane through the axis let $P_0$ be a fixed point, $P$ a variable point,
$P_0P$ a curve joining the two, and $S$ the surface generated by rotating
this curve through one radian about the axis. Take as positive the
right side of $S$ as seen by an observer on the positive side of $A$ and
moving from $P_0$ toward $P$, and let $\phi$ be the moment about the axis of
tractions exerted across $S$ by material on the positive upon material on
the negative side. This moment considered as a function of the
coordinates $r$, $z$ of $P$ is a stress function for shearing forces across $A$.

In the first place $\phi$ is independent of the path from $P_0$ to $P$. For if
two different paths are taken the associated surfaces $S$, $S'$ and two
planes through the axis bound a region in equilibrium. By symmetry
the tractions on the two plane faces have a total moment zero. Hence
the moments of tractions on the positive sides of $S$ and $S'$ are equal.
The quantity $\phi$ is therefore a function of $r$ and $z$ only.

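If $r$ is changed to $r + dr$ the area of $S$ is increased the amount $r\,dr$
and $\phi$ the amount $\tau_{\theta z} r^2\,dr$. Hence

$$\frac{\partial \phi}{\partial r} = r^2 \tau_{\theta z}. \qquad (5)$$

In a similar way we obtain

$$\frac{\partial \phi}{\partial z} = -r^2 \tau_{r\theta}, \qquad (6)$$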



the negative sign being due to the fact that the traction is on the inner
face of the cylinder $r =$ constant.

There are two cases to which these formulas are ordinarily applied.
The first is that of a shaft of circular section but variable diameter
stressed by tangential surface tractions perpendicular to the axis and
functions of $r$ and $z$ only.² If $P_0$ is taken on the axis the value of $\phi$
at a point $P$ on the surface of the shaft is equal to $\frac{1}{2\pi}$ times the moment
about the axis of tractions on the positive side of the cross section
through $P$.

The other is that of a circular ring sector twisted by opposite forces
applied along its axis in the planes through its ends.³ In this case the
function $\phi$ is constant on the surface of the ring and the value of the
constant may be taken as zero.

4. Torsion in a Bar of Constant Cross Section. Consider a bar of
constant cross section stressed by couples applied at its ends. In this
case the important feature of the stress distribution is that it is the same
on all cross sections. Take one of these sections as $xy$-plane and let
$x$, $y$, $z$ be a right-handed system of rectangular axes. Let $P_0$ be a fixed
point, $P$ a variable point in the $xy$-plane, and $P_0P$ a curve joining the
two. Through this curve pass a cylindrical surface $S$ with generators
of unit length parallel to the $z$-axis. Take as positive the right side of
this surface as seen by an observer on the positive side of the $xy$-plane
moving from $P_0$ toward $P$. Let $\phi$ be the $z$-component of traction exerted
across this surface by material on the positive upon material on the
negative side. This quantity $\phi$ considered as a function of the
coordinates $x$, $y$ of $P$ is the stress function.

In fact if two different paths are taken from $P_0$ to $P$ the two surfaces
$S$, $S'$ and two plane ends bound a region in equilibrium. Since the
tractions on the ends are opposite, the tractions on the positive sides of
$S$ and $S'$ must be equal. Thus $\phi$ is a function of $x$ and $y$ only. When
$y$ is changed to $y + dy$ the change in $\phi$ is $\tau_{xz}\,dy$. Hence

$$\frac{\partial \phi}{\partial y} = \tau_{xz}. \qquad (7)$$

Similarly

$$\frac{\partial \phi}{\partial x} = -\tau_{yz}, \qquad (8)$$

² Timoshenko, Theory of Elasticity, page 278.
³ Timoshenko, loc. cit., page 357.




the negative sign being due to the fact that in a change from $x$ to $x + dx$
the added traction is on the negative side of the surface. These are the
equations ordinarily used to determine torsion in a bar of constant
section.*
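
As a numerical illustration, the sketch below takes equations (7) and (8) in their usual form, $\tau_{xz} = \partial\phi/\partial y$ and $\tau_{yz} = -\partial\phi/\partial x$, and checks that stresses derived in this way satisfy the equilibrium condition $\partial\tau_{xz}/\partial x + \partial\tau_{yz}/\partial y = 0$. The particular stress function (one that vanishes on an ellipse) and the names `phi`, `stresses` and `divergence` are assumptions made for the example; they do not come from the paper.

```python
# Illustrative stress function: zero on the ellipse (x/a)^2 + (y/b)^2 = 1.
# The form of phi and the semi-axes a, b are assumptions for this sketch.
def phi(x, y, a=2.0, b=1.0):
    return 1.0 - (x / a) ** 2 - (y / b) ** 2


def stresses(x, y, h=1e-4):
    """Central-difference estimates of tau_xz = d(phi)/dy and tau_yz = -d(phi)/dx."""
    tau_xz = (phi(x, y + h) - phi(x, y - h)) / (2 * h)
    tau_yz = -(phi(x + h, y) - phi(x - h, y)) / (2 * h)
    return tau_xz, tau_yz


def divergence(x, y, h=1e-3):
    """d(tau_xz)/dx + d(tau_yz)/dy, which equilibrium requires to vanish."""
    d_xz = (stresses(x + h, y)[0] - stresses(x - h, y)[0]) / (2 * h)
    d_yz = (stresses(x, y + h)[1] - stresses(x, y - h)[1]) / (2 * h)
    return d_xz + d_yz


tau_xz, tau_yz = stresses(0.5, 0.25)
print(tau_xz, tau_yz, divergence(0.5, 0.25))
```

Because both stresses come from one function, the mixed second derivatives cancel and the divergence vanishes for any smooth choice of `phi`, not only for the quadratic used here.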

Since the surface of the bar is free from traction it is evident that $\phi$ is
constant on a connected part of the surface. But on parts of the surface
which are not connected it may not have the same value. Then $\phi$ will
however not be single valued.

It should be noted that this case may be regarded as obtained in the
limit from the preceding one. To see this consider a ring sector generated
by rotating an area through part of a revolution about an axis in
its plane. When twisted by couples applied to its ends the stresses are
determined by equations (5) and (6). Leaving the cross section constant
let the radius of the ring increase. The sector becomes more and
more nearly straight and the stresses approach those in a straight bar.
If $\phi$ is the moment of tractions on the surface $S$ of §3, it is also clear that

1  1  30 

r*  dr  ’ 

approach  the  limits 

30  30 

dx  ’  dz  ' 

where $\phi$ is the $z$-component of tractions on the surface $S$ of §4.


* Timoshenko, loc. cit., page 230.


$V_m$ IN $R_{m+1}$ POSSESSING AN M-TUPLY ORTHOGONAL NET
ALONG THE LINES OF CURVATURE

By N. Kaplan


I. Introduction. If we consider a $V_m$ in $R_{m+1}$ possessing the above
property, then with reference to this coordinate system on $V_m$

(1.1) $g_{ab} = 0$  $(a \neq b)$

(1.2) $h_{ab} = 0$  $(a \neq b)$

For  with  reference 

to  the  unit  vectors  t«  along  the  curvilinear  coordi- 

nate  lines 

(2.1) 

im  “  J 

9 

ha  - 

(2.2) 

m 

-  53  “  0 

(a  ^  b) 

#  — 1  9  9 

(2.3) 

m 

hah  ^  2  A  t,  t*  ■■  0 
a~l  a  a  a 

Thus from (1), it follows that the first and second fundamental forms
simultaneously reduce to canonical form:

(3.1) $F_1 = g_{11}(du^1)^2 + \cdots + g_{mm}(du^m)^2$

(3.2) $F_2 = h_{11}(du^1)^2 + \cdots + h_{mm}(du^m)^2$

Conversely, if for some orthogonal net both forms are reducible to
canonical form then the net lies along the lines of curvature. For then

(4.1) $g_{ab} = 0$  $(a \neq b)$

(4.2) $h_{ab} = 0$  $(a \neq b)$
Now  we  know  that  the  equations 

(5.1) 

9  9  9 

possess a unique set of solutions, namely the directions of principal
curvature. From (4), it follows that the unit vectors along the coordinate
lines are a possible solution of (5.1). But from the above
remark, these are the only solutions. Hence the curvilinear net lies
along the lines of curvature. Of course, we assumed that

(5.2)  ^  hit,  (a 

* 

If certain of the principal curvatures are equal, then the corresponding
lines of curvature are indeterminate and will be assumed to lie along the
net. Hence from an analytic standpoint our $V_m$ in $R_{m+1}$ are those for
which the two quadratic forms can be reduced to canonical form
simultaneously.

II. The Gauss-Codazzi Relations for $V_m$ in $R_{m+1}$. Let

(1.1) $y^\lambda$  $(\lambda = 1, \dots, m + 1)$

(1.2) $u^a$  $(a = 1, \dots, m)$

denote a Cartesian orthogonal coordinate system in $R_{m+1}$ and the
orthogonal curvilinear coordinate system in $V_m$. The Gauss Relations are

(2.1) $R_{abcd} = h_{ac}h_{bd} - h_{ad}h_{bc}$

Dividing this set of equations into three types, we find

(2.2) $R_{abcd} = 0$  $(a, b, c, d \neq)$

(2.3) $R_{abbc} = 0$  $(a, b, c \neq)$

(2.4) 

Kahha  *  — hh(hahiy 

•  » 

From Eisenhart “Riemannian Geometry,” pg. 119, we find that (2.2)
is satisfied identically but (2.3), (2.4) become

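(2.5) $\frac{\partial^2 h_b}{\partial u^a \partial u^c} - \frac{1}{h_a}\frac{\partial h_b}{\partial u^a}\frac{\partial h_a}{\partial u^c} - \frac{1}{h_c}\frac{\partial h_b}{\partial u^c}\frac{\partial h_c}{\partial u^a} = 0$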

(2.6) 

dha 

^  da*J 

-  -hhihaht)' 

u  » 

The  Codazzi  Relations  when  expanded  are 

(3.1) 

(h  -  h\  i..»i-i»  -fh-h\  i..*i-t'  -  0 

\r  r  t  %  \r  •/ r  1  » 

(3.2) 

r,  «,  /  »  1  . . .  m 



Using the fact that the lines of curvature lie along an orthogonal net in
$V_m$, we find


(4.1) 


r  “  h.  ’ 


t'a  ■«  ht  6r» 


From Eisenhart’s “Riemannian Geometry,” pg. 44, we find


(4.2) 

(4.3) 


r;. 


0 


(a,  b,  c  9^) 


r-  J_^  . 

“  “  2hl  du*  ' 


r:. 


jL^ 

2hl  3u* 


(4.4) 


r*  « 


_L  ^ 

2h\  du‘ 

Thus,  we  find  (3.1)  is  satisfied  identically  and  (3.2)  becomes 
dh 


(6.1) 


(*  *)  S*.  _  0 


(a,  6  BE  1  . . .  m) 


du*  h,  du* 

III.  The  Va  in  We  shall  show  that  any  set  of  vectors  ^  t,  t  •  •  • 

along the lines of curvature are $V_a$-building. This follows from an
expansion of

_  pr 

”  h.h. 


(1.1) 


*•***..» 


P  t  r  "p"# 

(1.2) $(p, q = 1, \dots, a)$  $(r = a + 1, \dots, m)$

But from (II, (4.2))

(2.1) $\Gamma^r_{pq} = 0$  $(r, p, q \neq)$

Hence


(2.2) 


»**  *I«.M 

p  q  r 


0 


Thus  our  theorem  is  proved. 

We  shall  next  confine  our  study  to  the  C7  families  of  qo"-*  Fj^i, 

These  are  of  two  types:  (1)  completely  orthogonal;  (2)  1/2  orthogonal. 
We  shall  in  specific  study  the  (m  —  1)  families  of  «  "“*(1/2  orthogonal) 
Vj  of  (a  =  2  •  •  •  m).  W’e  shall  denote  the  fundamental  quanti¬ 

ties  of  these  F*  in  by  (— )  as  in  ^t,  which  we  study  explicitly. 




(3.1) 

(3.2) 


I  =  t 

I  a 


I  -  i 
■•+ 1  •*+ 1 


are the $V_2$ normals

(3.3) 

(3.4) 

(3.5) 


St 


du*  bu* 


b^‘  dll*  2) 

(g,  p  »  3  .  • .  m  +  1) 


V. 


(3.6) 


^  dy*  rf«*  ^  dy*  1 
,  *  du*  ds^  “  du"  Ap 


(p  =  3  • .  •  m) 


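Again, let $x^\lambda$ denote the equations of $V_2$ in $R_{m+1}$ and let $y^\lambda$ denote the
equations of $V_m$ in $R_{m+1}$. Then

(4.1) $x^\lambda(u^1, u^2) = y^\lambda(u^1, u^2, c^3, \dots, c^m)$

(4.2) $\frac{\partial x^\lambda}{\partial u^a} = \frac{\partial y^\lambda}{\partial u^a}$  $(a = 1, 2)$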

For $V_2$ in $R_{m+1}$, we find

(5.1)  •*.  .) 

y  p  ^  \P“3--*m4-l/ 

For $V_m$ in $R_{m+1}$, we have

(5.2) 


»*+ 1 


.b  dy^ 
-  *-  8? 


(5.3) 

From  (3.6) 


du-du*  *-*dia‘  .VxJ+j 


di* 


(5.4)  i\  «  1  ^ 

;  du‘  bu*  bu”  hp  ^  bu^  bu*  \hpj 

From  (5.1),  (5.4),  we  find  for  (p  =  3  •  •  •  m) 


(p  =  3  •  •  •  m) 




From  (5.3) 


(5-6)  S,  k,  “*■  dt  3u-  (ap)  “  5‘-  du*  +  2  f 

Using  (3.6)  and  (4.2),  this  becomes 

(5.7)


+  -0 
Ap  m+l 


By  transvecting  with  various  independent  vectors; 


(5.8) 

^  ~  ^  (A,)* 

p  ^p 

(p  =  3  . .  •  m) 

(5.9) 

e.  -  - 

■•+ 1 ,  p  "p 

(p  -  3  •  •  •  w) 

(5.10) 

«  pr 

rp  «p 

(r,  p  »  3  •  •  •  m) 

(5.11) 

(identically) 

From (II, 4.2), (I, 1.2), we find using (5.9), (5.10)

(6.1) 

From  (5.8) 

tip  =  0 

9ff 

(p,  g  -  3  • .  •  m  +  1) 

(6.2) 

o 

II 

(a  6) 

(6.3)  -rfh 

p  .  Ap  Ap  du^ 

From  (5.2),  (5.1)  and  (6.1),  we  find 

(6.4)  Spt  =  A,*  (o,  6  »  1, 2)  (p  »  3  .  •  •  m) 

•1+1 

From  (6.2),  (6.3),  we  have 

(6.5)  Apt  «  A  t.  u  +  ^  u  it 

p  i»i  1  1  pf  1  t 

^  1  dhx  ^  ^  1  dhf 

pi  Ai  Ap  du^  ’  p t  fh  hp  du^ 


(6.6) 




We now study the Gauss, Codazzi Relations for this $V_2$ in $R_{m+1}$. Note
from (6.1), (6.4), (6.5) that the Ricci Relation is identically satisfied.
The Codazzi Relation becomes by (II, (5.1))

/ _ \_ahjL,  _1_ 

.  .  a  /  1  dh,\  \  h,  h„  du”  A.  h,  du*)  dh^ 

'  ^  ^  aM*\  KKdu*)  K  du*" 


(7.2) 


(7.3) 


1  yA.  _1_  . _ 1_  dK  ^ 

A.  A,  du*  dw'  A.  h\  dM'  »«*  a;  A,  dM*  du' 

1  dA«  dAt  1  dAa  dAa 
A.  A*  A,  ^  ~  Aj  A,  ^ 

y  A.  1  dA«  dAp  1  dA.  dhs 

dti*  di4'  Ap  du*  du*  A*  du*  dw'  ” 


These equations are exactly the Gauss Equations II, (2.5)
when

$(a, b = 1, 2;\ p = 3, \dots, m)$

Now the Gauss Relation for $V_2$ in $R_{m+1}$ is

R^m  ^  ^  \ 

\  P  P  P  P  P  / 

=  —  X)  ^  ^ 


(8.1) 


Since 

(8.2) 

And 

(8.3) 


R^  -  fr  —)  +  -Ar—^] 

\A<.  du-/  ^  d«*  \A*  dti*/  J 


(a  9^  b) 


Hence the equations (8.1) reduce to type II, (2.6). If we consider all
the $C^m_2$ families of $\infty^{m-2}$ $V_2$ $(a, b = 1, \dots, m)$ in $R_{m+1}$ we see that
their Gauss, Ricci, Codazzi Relations are the equations II, (2.5), (2.6),
(5.1). Thus the Gauss, Codazzi Relations of $V_m$ in $R_{m+1}$ do not
condition these $V_2$ in $R_{m+1}$. Now the $V_m$ can be generated by the $(m - 1)$
families $(a = 2, \dots, m)$. Hence we have the theorem “$V_m$ in
$R_{m+1}$ possessing an $m$-tuply orthogonal net along the lines of curvature
consist of $(m - 1)$ families of $\infty^{m-2}$ $V_2$ in $R_{m+1}$ which are (1/2)
orthogonal and possess the fundamental quantities

(9.1)  Qab  =*  Qak  ^  h,hi,6^  (a,  b  =  1,  2;  1,  3;  •  •  •) 

(9.2)  =  0  (p,  g  =  3  •  •  •  m;  2,  4  •  •  •  m;  •  •  •) 

(9.3) 

p  hp  du* 

(9.4)  hti  =  hhahi.Spb 

m-f  1  m 

From Eisenhart “Riemannian Geometry,” pg. 189, we can easily show
that “A necessary and sufficient condition that $V_2$ in $R_{m+1}$ be a $V_2$ in
$R_k$ is that

(10.1)  ^  «  0  (q,  p  ^  k  •  •  •  m) 

P 

(10.2)  Ca  -  0 

p  f 

Hence,  we  see  that  ‘‘A  necessary  and  sufficient  condition  that  any  family 
of  Fj  lie  in  Rk  is  that 

(lU)  (a.I.r) 

(11.2)  let  g  =  1,  •  •  •  r  •  •  •  ,  m  -|-  1  lie  in  then  p  takes  on  values  in 
the  set  (1, 2,  •  •  •  m  -1-  1)  which  are  not  q 

In particular “A sufficient and necessary condition that the generating
$V_2$'s lie in $R_3$ is that the fundamental form of $V_m$ be of the type

(12.1) $ds^2 = (du^1)^2 + f_2(u^1, u^2)(du^2)^2 + \cdots + f_m(u^1, u^m)(du^m)^2$”

In this case, the curves of parameter $u^1$ are geodesics in $V_m$. Similar
results follow from an examination of the $V_a$'s in $V_m$.

In this paper, the existence of the $V_m$ was not proved. Properties
of the $V_m$ were discussed to lay a basis for the general existence proof
which will follow in a future paper. However, a simple integration will
suffice to show that $V_m$ of the type discussed and possessing linear
element (12.1) do exist.


THE  INDETERMINATE  AND  COMPOSITE  PRODUCTS  OF 
MATRICES* 

By G. W. King

In a linear algebra the most general combinatory relation between $p$
$n$-fold quantities gives rise to elements in an $n^p$-dimensional manifold.
For the combination of two entities $A = a_i e_i$ and $B = b_j e_j$, where the
$e$'s are reference points, basis elements or functions, gives $AB = a_i b_j e_i e_j$.
The most general assumption is that there are no linear relations between
the symbols $(e_i e_j)$, which yield $n^2$ new elements defining a manifold in
which $AB$ is an entity.

This product, the indeterminate or outer, of 2-way matrices (giving
$2p$-way matrices) is primarily the subject of this paper. Except for
concepts of rank it is generally sufficient (in an associative algebra) to
consider certain 2-way representations rather than the purely formal
$2p$-way product. Under the name composite matrix the 2-way
representation of the indeterminate product is induced in the composition of
linear forms.*

* Contribution from the Research Laboratory of Physical Chemistry, M. I. T.,
No. 342.

Most of this paper was presented as a thesis, part of the requirements of a
course in the Algebra of the Quantum Theory given by Prof. F. L. Hitchcock
(M. I. T.), who suggested the topic, and to whom I express my thanks for his very
active interest in the preparation of the work for publication.

* C. Stephanos, Jnl. math. pures et appl., 6, 73-128, 1900; H. Weyl,
Gruppentheorie und Quantenmechanik, Leipzig, 1928, p. 78. L. H. Rice considers these
matrices in certain tensor transformations, Adjoint and inverse determinants and
matrices, Jnl. Math. and Phys., 6, 55-64, 1925. The original inventor seems to be
Zehfuss, Zeit. für Math. u. Phys., 3, 298, 1858.

If there are two function-manifolds

$x'_i = a_{ij} x_j$,  $y'_k = b_{kl} y_l$,  (1)

where $i = 1$ to $m_A$, $j = 1$ to $n_A$, $k = 1$ to $m_B$, $l = 1$ to $n_B$,
then the manifold of the functions $(xy)$ has a composite for an induced
matrix,

$(xy)'_p = \Psi_{pq} (xy)_q$,

derived from $(x_i y_k)' = a_{ij} b_{kl} (x_j y_l)$:

$\Psi = \| \Psi_{pq} \| = \| (a_{ij} b_{kl})_{(ik)(jl)} \|$  (2)

where

$p \equiv (ik) = 11, 12, \dots, 1m_B, 21, \dots, 2m_B, \dots, m_A1, \dots, m_Am_B$

$q \equiv (jl) = 11, 12, \dots, 1n_B, 21, \dots, 2n_B, \dots, n_A1, \dots, n_An_B$

where $ik$ and $jl$ are bipartite indices, each pair being considered as a
single index. The rows (columns) are ordered on the first index, then
on the second index in the following manner: the row (column) $(rs)$
stands before or after $(tu)$ according as $r$ is less than or greater than $t$,
and the row (column) $(rs)$ comes before or after $(ru)$ according as $s$ is
less or greater than $u$.

The induced matrix $\Psi$ is a composite matrix whose elements are all
possible (binomial) products of the elements of the factors $A$ and $B$ (one
element from each) arranged according to the above convention.*
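
The ordering convention can be made concrete with a short sketch. It builds the composite matrix element by element from the bipartite indices $(ik)$ and $(jl)$, then checks on one numerical case that it transforms the products $x_j y_l$ into the products $x'_i y'_k$. The function names `composite` and `apply` and the sample values are assumptions made for the illustration.

```python
def composite(A, B):
    """Composite matrix of A (mA x nA) and B (mB x nB), built element by element."""
    mA, nA, mB, nB = len(A), len(A[0]), len(B), len(B[0])
    psi = [[0] * (nA * nB) for _ in range(mA * mB)]
    for i in range(mA):
        for k in range(mB):
            p = i * mB + k              # bipartite row index (ik), first index major
            for j in range(nA):
                for l in range(nB):
                    q = j * nB + l      # bipartite column index (jl)
                    psi[p][q] = A[i][j] * B[k][l]
    return psi


def apply(M, v):
    """Ordinary matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


A = [[1, 2], [3, 4]]
B = [[0, 5], [6, 7]]
x, y = [2, 3], [5, 7]

PSI = composite(A, B)
xy = [xj * yl for xj in x for yl in y]                          # products x_j y_l
xy_new = [xi * yk for xi in apply(A, x) for yk in apply(B, y)]  # products x'_i y'_k
print(PSI)
print(apply(PSI, xy) == xy_new)
```

With zero-based indices the bipartite index $(ik)$ becomes the single row number `i * mB + k`, which is exactly the "first index, then second index" ordering described above.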

A variety of names have been used for these matrices; the word
direct is distinctly at variance with Gibbs’ language (direct being
ordinary matrix multiplication), but has precedent in that the direct product
of two groups whose elements are matrices is a group of composite
matrices. We prefer to conform to the terminology of multiple algebra,*
namely the indeterminate product, or more exactly from the polyadic
point of view, the multiple indeterminate product,* reserving the word
composite for the 2-way representation in multipartite indices.

Again, the symbol $\times$ mostly used is an unfortunate choice in relating
it to the ideas of Gibbs, who used | or conjunction without a sign for
indeterminate, and the cross for skew products of multiple quantities
(not only of vectors). However, to avoid confusion, and especially as
the composite product of matrices differs in its conventions from the
indeterminate product of dyads, we shall use the symbol $\times$, which of
course must not convey any idea of skewness.* Thus

$\Psi = A \times B$.  (3)

* A composite product of two 3 by 3 matrices is exhibited by Rice, id.

* J. W. Gibbs, Multiple Algebra, in Collected Papers, Longmans.

* F. L. Hitchcock, A theory of ordered determinants with application to polyadics,
Jnl. Math. and Phys., 4, 205-37, 1925.

* Other symbols are $A \cdot\times B$, C. C. MacDuffee, The Theory of Matrices, Berlin,
1933, p. 81, $A \langle B \rangle$ used by W. E. Roth, On Direct Product Matrices, Bul. Am.
Math. Soc., 40, 461-8, 1934, and D. E. Rutherford, id., 39, 801-08, 1933. The
notation $A \times B$ is used by Stephanos, Weyl, and R. Oldenburger, Ann. Math.,
35, 622-54, 1934.




The multiplication is not commutative, but the two products are
equivalent matrices; indeed they may be converted into each other by
merely interchanging rows and columns. This the physicist is at
perfect liberty to do when setting up the equations (1).
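
A small check of this remark, as a sketch: the two products differ as arrays, but reordering the rows from $(ik)$ to $(ki)$ and the columns from $(jl)$ to $(lj)$ turns one into the other. The helper names `kron`, `swap_order` and `permute` are assumptions made for the example.

```python
def kron(A, B):
    """Composite product: block (i, j) of the result is A[i][j] * B."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def swap_order(n_first, n_second):
    """Positions in the (first, second) ordering, listed in (second, first) order."""
    return [i * n_second + k for k in range(n_second) for i in range(n_first)]


def permute(M, rows, cols):
    """Reorder the rows and columns of M."""
    return [[M[r][c] for c in cols] for r in rows]


A = [[1, 2, 3], [4, 5, 6]]   # 2 x 3
B = [[7, 8], [9, 10]]        # 2 x 2

AB, BA = kron(A, B), kron(B, A)
rows = swap_order(len(A), len(B))          # row index (ik) -> (ki)
cols = swap_order(len(A[0]), len(B[0]))    # column index (jl) -> (lj)
print(AB == BA)
print(permute(AB, rows, cols) == BA)
```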

Further it is seen that this multiplication, in the case of continued
products, is associative, the row indices (column indices) of the factors
forming the multipartite row (column) indices of the composite,
arranged according to a definite convention, viz., the natural ordering,
defined as follows. Let $i_k$ be the first index of the series $i_1 i_2 \cdots i_p$
which has a different (fixed) value in two chosen rows (columns); that
row (column) precedes in which $i_k$ has the smaller value. For the
trilinear form

$(xyz)'_{ikm} = a_{ij} b_{kl} c_{mn} (xyz)_{jln}$

can be obtained by composing either $(xy)$ with $z$, or $x$ with $(yz)$ by the
above conventions.

$(A \times B) \times C = A \times (B \times C)$

The combinatory process is distributive with respect to addition,
justifying the word multiplication.

If $x_i(y_k + z_k) = (x_i y_k + x_i z_k)$

then on making the linear transformations we have

$a_{ij} x_j (b_{kl} y_l + c_{kl} z_l) = (a_{ij} b_{kl} x_j y_l + a_{ij} c_{kl} x_j z_l)$

or $A \times (B + C) = (A \times B) + (A \times C)$
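
Both laws can be checked directly on small integer matrices. The sketch below is only an illustration; `kron` and `add` are hypothetical helper names, and the matrices are arbitrary.

```python
def kron(A, B):
    """Composite product with the natural (first index major) ordering."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def add(A, B):
    """Ordinary element-wise sum of two matrices of the same order."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


A = [[1, 2], [3, 4]]
B = [[0, 1, 2], [3, 4, 5]]
C = [[6, 7, 8], [9, 1, 2]]
D = [[2, -1]]

assoc_left = kron(kron(A, B), D)       # (A x B) x D
assoc_right = kron(A, kron(B, D))      # A x (B x D)
dist_left = kron(A, add(B, C))         # A x (B + C)
dist_right = add(kron(A, B), kron(A, C))
print(assoc_left == assoc_right, dist_left == dist_right)
```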


The indeterminate product, being by definition induced in a manifold
of $m_A n_A m_B n_B$ dimensions, is a 4-way matrix. The bipartisation gives a
representation of this 4-way in a composite manifold of ordinary
matrices (but of higher order). Indeed the multipartisation of the
$2p$-way matrix

$\| a^{(1)}_{i_1 j_1} a^{(2)}_{i_2 j_2} \cdots a^{(p)}_{i_p j_p} \|$  (4)

is nothing more than a method of writing it as a 2-way matrix, whose
rows are the $i_1 \cdots i_p$-couches and columns the $j_1 \cdots j_p$-couches,* which
matrix has some peculiar properties.

If  a  2-way  matrix  be  partitioned,  it  is  well  known  that  we  may  treat 
it  as  if  it  were  compound,  a  matrix  with  elements  that  are  matrices  (the 


* For definition of couches, and couche-representation, see L. H. Rice, Couche
ranks and the general matrix, Jnl. Math. and Phys., 7, 93-6, 1928.




partitions). The matrix (4) may be said to be a compound matrix
whose $ij$'th element is the $2p-2$-way couche of aspect $ij$, namely $a_{ij}$
times the matrix which is the indeterminate product of the last $p - 1$
matrices. Again these matrical elements have elements which are the
indeterminate product of the $p - 2$ postfactors, and so on. The
composite matrix

$\cdots \times A_{k+1} \times \cdots \times A_p$

may be said to form the $(i_1 \cdots i_k)(j_1 \cdots j_k)$'th partition of the $k$'th
generation.

This view is intermediate between a $2p$-way and a 2-way matrix
representation, and has special interest for us. For in the 4-way matrix

$\Psi_{ijkl} = a_{ij} b_{kl}$  (5)

and as a compound matrix

$\Psi = \| \Psi_{ij} \|$ where $\Psi_{ij} = \| a_{ij} B \|$.  (6)

That is, we consider the indices $ij$ to be the locant of the matrical
elements of $\Psi$, of order $m_A$, $n_A$ ($m_A$ rows, $n_A$ columns), the elements being of
order of $B$. Or, in the sense of partitions (7), the $ij$'th block is the matrix
$B$ times the scalar $a_{ij}$.

Since all the elements are multiplied by the same quantity, $\Psi$, as a
compound matrix, can be factored into the direct product (ordinary
matrix multiplication) of a scalar compound matrix $\Theta_B$ whose $n_A$ diagonal
elements are the matrices $B$, by a compound matrix $\Phi_A$ whose $ij$'th
element is the scalar matrix $a_{ij}I$, of order $m_B$,

[Figure: the composite matrix $A \times B$, exhibited by blocks.]

or into $\Theta'_B \Phi'_A$ where the blocks of $\Phi'_A$ are now of order $n_B$, and the
number of non-zero (main diagonal) elements of $\Theta'_B$ is now $m_A$. Then

$\Psi = A \times B = \Phi_A \cdot \Theta_B = \Theta'_B \cdot \Phi'_A$.  (8a)

In discussions of many of the succeeding equations it is important to
remember the orders of the matrices. Beneath these equations we
shall write the orders as if the matrices were compound. Thus

$R$
$n_1 n_2,\ n_3 n_4$

indicates $R$, as a compound matrix, has $n_1$ rows, $n_3$ columns and the
matrical elements have $n_2$ rows and $n_4$ columns. This notation also
shows the values the indices (equation 5) register, $i = 1$ to $n_1$, $j$ to $n_3$,
$k$ to $n_2$, $l$ to $n_4$. Then we shall have as an auxiliary equation to (8a)

$m_A m_B,\ n_A n_B = m_A m_B,\ n_A m_B \cdot n_A m_B,\ n_A n_B$

$= m_A m_B,\ m_A n_B \cdot m_A n_B,\ n_A n_B$.  (8b)

We have thus replaced the composite product by an ordinary product
of two compound matrices of simple structure, $\Phi_A$ and $\Theta_B$, functions of
$A$ and $B$ alone (but whose orders are determined by $B$ and $A$).
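
The two factorisations of (8a) can be verified numerically. The sketch below reads them as $(A \times I)(I \times B)$ and $(I \times B)(A \times I)$, with the identity orders chosen so that the ordinary products are defined; the helper names and the rectangular test matrices are assumptions made for the example.

```python
def kron(A, B):
    """Composite product: block (i, j) of the result is A[i][j] * B."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def matmul(A, B):
    """Ordinary matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


A = [[1, 2, 3], [4, 5, 6]]       # m_A = 2, n_A = 3
B = [[1, 0], [2, 1], [0, 3]]     # m_B = 3, n_B = 2
m_A, n_A, m_B, n_B = len(A), len(A[0]), len(B), len(B[0])

phi_A = kron(A, identity(m_B))          # A x I, blocks of order m_B
theta_B = kron(identity(n_A), B)        # I x B, n_A diagonal blocks
theta_B_alt = kron(identity(m_A), B)    # I x B, m_A diagonal blocks
phi_A_alt = kron(A, identity(n_B))      # A x I, blocks of order n_B

PSI = kron(A, B)
print(matmul(phi_A, theta_B) == PSI)
print(matmul(theta_B_alt, phi_A_alt) == PSI)
```

For rectangular factors the two factorisations use identities of different orders, which is the point of the auxiliary equation (8b).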




PROPERTIES OF $\Phi$ AND $\Theta$

$\Theta_P$ is a scalar compound matrix of arbitrary order, $n_\Theta$, whose diagonal
elements are the matrices $P$, i.e. $I_{n_\Theta} \times P$. As a 2-way compound
matrix, $\| \Theta_{ij} \|$, $\Theta_{ii} = P$, the order is $n_\Theta m_P$, $n_\Theta n_P$. As a 4-way matrix

$\Theta_{ijkl} = \delta_{ij} P_{kl}$  (9)

(the true significance of the direct product of two 4-way matrices will be
taken up later).

Thus the layers of aspects $i$ and $j$ are identical, and the $ij$-couches are
symmetrically placed about the $ij$ directions.

$\Phi_P$ is a compound matrix of the order of $P$ whose $ij$'th element is the
scalar matrix $P_{ij} I_{n_\Phi}$, of arbitrary order $n_\Phi$, i.e. $P \times I_{n_\Phi}$,

$\Phi_P = \| \Phi_{ij} \|$,  $\Phi_{ij} = P_{ij} I_{n_\Phi}$.

As a 2-way compound matrix the order is $m_P n_\Phi$, $n_P n_\Phi$. As a 4-way
matrix

$\Phi_{ijkl} = \delta_{kl} P_{ij}$  (10)

$\Phi_P$ is symmetrical in the three dimensional plane perpendicular to the
symmetry plane of $\Theta_P$.

Theorem 1. The functions $\Phi$ are not commutative with each other;
nor are the functions $\Theta$; but the $\Phi$'s are commutative with the $\Theta$'s in the
sense that the forms are preserved, although the orders are changed.
Writing (8a) more explicitly,

$\Psi = A \times B = (A \times I_{m_B})(I_{n_A} \times B) = (I_{m_A} \times B)(A \times I_{n_B})$  (8c)

Corollary. If their arguments (i.e. $A$ or $B$) are square (but not
necessarily the same order) the matrices $\Phi_A$ and $\Theta_B$ are commutative in
the exact sense, for in equation (8) $\Phi_A = \Phi'_A$ and $\Theta_B = \Theta'_B$. This is
the theorem given by Weyl and Stephanos,

$A \times B = (A \times I_B)(I_A \times B) = (I_A \times B)(A \times I_B)$

Theorem 2. $\Phi_{\Theta_P} = \Theta_{\Phi_P}$ with no change in the arbitrary orders. For
by the associative law

$(I_{n_\Theta} \times P) \times I_{n_\Phi} = I_{n_\Theta} \times (P \times I_{n_\Phi})$

0‘ 

.  -Op  Wo*  »•  Wo' 

*0P  <-» 


Theorem  3. 




by expanding as $I \times P$, and by the associative law. Similarly

<-r 

**• .  -  n*.  »  n*. 

'♦p 

Mixed  functions  4>‘ei  reduce  to  «I>e-  a*  -•  n*.  and  ne  *  IT  • 

0*.  i  i 

Theorem 4. By suitable interchanges of rows and columns $\Phi_P$ can be
converted into $\Theta_P$, $n_\Theta = n_\Phi$, or

$P \times I_{n_\Phi} \rightarrow I_{n_\Theta} \times P$

More precisely, there are $n_\Phi!$ different operators $E_1(\ )E_2$ such that

$E_1(\Phi_P)E_2 = \Theta_P$

$n_\Phi m_P,\ m_P n_\Phi \cdot m_P n_\Phi,\ n_P n_\Phi \cdot n_P n_\Phi,\ n_\Phi n_P$

$E_1$ and $E_2$ are derived from identity matrices of orders $m_P n_\Phi$ and $n_P n_\Phi$
respectively by transposing rows and columns according to a definite
plan:

^1  ■=  II  (*<a»*j*)(»*)(;0  II  or  II  .  (11) 

the  bipartisation  being  understood. 

“  II  (5«4y,*)<y«  II 

The indices take on the values indicated by the auxiliary equation,
and $\alpha$ signifies any permutation of the numbers 1 to $n_\Phi$ (but must be
the same permutation in both $E_1$ and $E_2$), $i_\alpha$ being the cardinal number, $i$
the ordinal number in the $\alpha$-permutation.

$E_2$ resembles $E_1$ in structure, and when $P$ is square $E_2$ is the inverse
(and transpose) of $E_1$; further if the normal order is used, $i_\alpha = i$ (say
$\alpha = 1$), $E_1$ and $E_2$ are equal.

We shall refer to the above $2n_\Phi!$ matrices as permutation-matrices of
type $E$. They are unitary. When $\alpha = 1$ the matrices are involutory
and hermitian,

E  ~  E-'^  -  -  j?' 

An expression for type $E$ matrices that is often useful is the following,
using bipartite indices of equation (2). Let $\eta$ be the permutation of
the numbers 1 to $m_P n_\Phi$ that changes the bipartite index $p \equiv (ik)$ into
$(ki)$; similarly $\epsilon$, a permutation of the numbers 1 to $n_P n_\Phi$, changes
$q \equiv (jl)$ into $(lj)$. Then

-  II  II 

.  (12) 

e,  -  II  (»^j„  II 

where $\alpha\eta$ indicates the product of the two permutations symbolised
by $\alpha$ and $\eta$.

Proof of theorem.

On expanding the left-hand side, remembering $\Phi_{ijkl} = \delta_{kl} P_{ij}$, and
letting the inverse of a permutation $\alpha$ be called $\beta$, i.e. if $i_\alpha = j$ then
$i = j_\beta$, the $ijkl$'th element reduces to

$\delta_{ij} P_{kl}$

or the $ijkl$'th element of $\Theta_P$, which was to be shown.
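
Theorem 4 can be illustrated for the normal order ($\alpha = 1$) by building the two permutation matrices explicitly. This is a sketch under that assumption only; the construction via `swap_order` and `perm_matrix`, and the name `type_e_pair`, are hypothetical and need not match the paper's own index scheme for $E_1$ and $E_2$.

```python
def kron(A, B):
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def transpose(M):
    return [list(col) for col in zip(*M)]


def swap_order(n_first, n_second):
    """Positions in the (first, second) ordering, listed in (second, first) order."""
    return [i * n_second + k for k in range(n_second) for i in range(n_first)]


def perm_matrix(perm):
    """Row r of the result has its single 1 in column perm[r]."""
    return [[int(c == perm[r]) for c in range(len(perm))] for r in range(len(perm))]


def type_e_pair(m_P, n_P, n):
    """Permutation matrices that reorder rows and columns of P x I_n (normal order)."""
    E1 = perm_matrix(swap_order(m_P, n))             # acts on the rows
    E2 = transpose(perm_matrix(swap_order(n_P, n)))  # acts on the columns
    return E1, E2


P = [[1, 2, 3], [4, 5, 6]]   # m_P = 2, n_P = 3
n = 2
E1, E2 = type_e_pair(2, 3, n)
converted = matmul(matmul(E1, kron(P, identity(n))), E2)
print(converted == kron(identity(n), P))
```

When $P$ is square the two index permutations coincide, so `E2` is the transpose (and inverse) of `E1`, in line with the remark above.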

Corollary 1. The matrix $\Theta_P$ can be changed into $\Phi_P$ by $n_\Theta!$ operators
composed of permutation-matrices of type $E$.

E\'‘  (0/.)  =  4>j.  n*  =  no 

n»rno,  n^mp  •  n^mp,  n^rip  •  non#-,  upTIq  =  mpn^,  npn^ 

-  II  (5.#5,.*)o«  II 
“  II  (hij.6ik)xiki  II 

When $P$ is square these are $E_2$ and $E_1$.

If the normal order is used, $\alpha = 1$, then

$E_1(\Theta_P)E_1 = \Phi_P$

Corollary 2. There are operators $Y_1(\ )Y_2$ that are identity operators
when acting on $\Theta$; $Y_{1,2}$ are unitary and derived from the identity matrix
by interchanging rows and columns according to the scheme

$Y_{1,2} = \| (\delta_{i j_\alpha} \delta_{kl})_{ijkl} \|$

the indices having different ranges for $Y_2$ ($\alpha$ has same value). There are
$n_\Theta!$ different $Y$'s of given order.

Proof. The existence of this theorem is obvious, for we merely
propose to interchange the matrices $P$ which are repeated along the
main diagonal of $\Theta_P$. The $ijkl$'th element of $Y_1(\Theta_P)Y_2$ is

$\delta_{ij} P_{kl}$

which is the $ijkl$'th element of $\Theta_P$.

If $P$ is square $Y_2$ is the inverse of $Y_1$. If $\alpha = 1$, $Y_{1,2}$ are involutory
and hermitian. Finally if $P$ is square and $\alpha = 1$, $Y_1 = Y_2$.

A corresponding corollary exists for $\Phi_P$. The operators are
$Y_1E_1(\ )E_2Y_2$. Note that there are still only $n_\Phi!$ different operators
as the $\alpha$-permutation of the $E$ and $Y$ apply to the same index in the
product, and permutations belong to a finite group.

Corollary  3.  II 

AXB  •  ExO^^W,'^  -  F^Fm 

«  =  GbGa 

Pe  =■  II  (6ikPii)iiU  II  »  II  (6uPik)iiki  II  .  (13) 


Now we have the composite product expressed as an ordinary product
of two similar functions of $A$ and $B$, whose orders depend on $B$ and $A$.
If $A$ and $B$ have the same orders, the functions will be identical.
Finally if $A = B$, then

F  ^  G  yet  F*  ^  (P 

Theorem 5. The functions $\Phi(\ )$ or $\Theta(\ )$ of two matrices combined by
the following algebraic symbols are equal to the same combinatory
process between the functions of the separate matrices. Thus

$\Theta_{P \cdot Q} = \Theta_P \cdot \Theta_Q$  (14)

since $\Theta$ is a scalar compound matrix.

$\Phi_{P \cdot Q} = E_1 \Theta_P \Theta_Q E_2 = E_1 \Theta_P E_2 E_1 \Theta_Q E_2 = \Phi_P \cdot \Phi_Q$  (15)

Qeyg  =  0;v  -  0,  X  0g  .  (16) 

*Fyg  =  4>*i?)  •  'Q*g’^*eX*g  .  (17) 

$\Theta_{P+Q} = \Theta_P + \Theta_Q$  (18)

$\Phi_{P+Q} = E_1(\Theta_P + \Theta_Q)E_2 = \Phi_P + \Phi_Q$  (19)
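
The product and sum rules of Theorem 5 are easy to confirm on small cases. In the sketch below `theta(P, n)` stands for $I_n \times P$ and `phi(P, n)` for $P \times I_n$; these helper names and the sample matrices are assumptions made for the illustration.

```python
def kron(A, B):
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def theta(P, n):
    """Scalar compound matrix I_n x P: P repeated n times along the diagonal."""
    return kron(identity(n), P)


def phi(P, n):
    """Compound matrix P x I_n: each element of P replaced by a scalar matrix."""
    return kron(P, identity(n))


P = [[1, 2, 3], [4, 5, 6]]
Q = [[1, 0], [2, 1], [0, 3]]
R = [[0, 1, 1], [2, 0, 5]]
n = 2

product_rule_theta = theta(matmul(P, Q), n) == matmul(theta(P, n), theta(Q, n))
product_rule_phi = phi(matmul(P, Q), n) == matmul(phi(P, n), phi(Q, n))
sum_rule_theta = theta(add(P, R), n) == add(theta(P, n), theta(R, n))
sum_rule_phi = phi(add(P, R), n) == add(phi(P, n), phi(R, n))
print(product_rule_theta, product_rule_phi, sum_rule_theta, sum_rule_phi)
```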




Theorem 6. The transpose, hermitian conjugate, adjoint, inverse
(hereafter * will be used to denote any of these) of a function ($\Theta(\ )$ or $\Phi(\ )$)
is equal to the function of the transpose, hermitian conjugate, adjoint
or inverse, as the case may be, of the argument.

$\Theta^*_P = \Theta_{P^*}$,  $\Phi^*_P = \Phi_{P^*}$.

Proof. In the case of $\Theta$ the first and second follow by definition. The
other two follow from the identities

$I = \Theta_{PP^{-1}} = \Theta_P \Theta_{P^{-1}}$,  $\Theta_{P^{-1}} = (\Theta_P)^{-1}$

The theorem is true for $\Phi$ since

$\Phi^*_P = (E_1 \Theta_P E_2)^* = E^*_2 \Theta_{P^*} E^*_1$

mpTi^,  •  n^UF,  n^mF  *  n^mF,  mFfi^ 

by ordinary matrix theory. Since $E_1$ and $E_2$ may be chosen to be
involutory, the last product is $\Phi_{P^*}$ by theorem 4, corollary 1.

Theorem 7. If $P$ is orthogonal, unitary or hermitian, so are the
functions $\Phi_P$ and $\Theta_P$.

Proof. First we prove the theorem for $\Theta$, by the identities
$PP' = I$,

$\Theta_{PP'} = \Theta_P \Theta_{P'}$ by theorem 5

$= \Theta_P (\Theta_P)'$ by theorem 6

which is the condition for $\Theta_P$ to be orthogonal. The proof for the unitary
character is along similar lines. To show $\Theta_P$ is hermitian if $P$ is, we have

$P = \bar{P}'$

whence $\Theta_P = \Theta_{\bar{P}'} = (\bar{\Theta}_P)'$ by theorem 6. Since $E_{1,2}$ can be chosen
hermitian, the operation $E_1(\ )E_2$ will then not change the orthogonal,
unitary or hermitian character when applied to $\Theta$ to give $\Phi$.

Theorem 8. If $H^{-1}PH = T$, a diagonal or triangular (elements
below the main diagonal are zero) matrix, then

$\Theta_{H^{-1}} \Theta_P \Theta_H = \Theta_T$

and similarly for $\Phi$. $\Phi$ and $\Theta$ of diagonal or triangular matrices are
diagonal or triangular. The proof follows from theorem 5, $\Theta_T = \Theta_{H^{-1}PH}$,
which is equal to the three products on the left. The proof for $\Phi$ is
parallel.




Theorem 9. The roots of $\Theta_P$ (the main diagonal elements of $\Theta_T$) are
the same as those of $P$, repeated $n_\Theta$ times. The roots of $\Phi_P$ are the same,
repeated $n_\Phi$ times, for since $P$ is of necessity square, $E_2 = E_1^{-1}$ whatever
$\alpha$ is, and $E(\ )E^{-1}$ does not change the roots.

Theorem 10. The determinant of $\Theta_P$ (and $\Phi_P$) is the determinant of $P$
raised to the $n_\Theta$'th (or $n_\Phi$'th) power, as is obvious from the structure of
$\Theta_P$, and the fact that $|E|$ is unity proves the theorem for $\Phi_P$.

Theorem 11. The ranks of $\Theta_P$ and $\Phi_P$ are both equal to the rank of $P$
multiplied by $n_\Theta$ (or $n_\Phi$).

Proof. A row of $\Theta_P$ is a row of $P$ encased in zeros, and each row is
repeated $n_\Theta$ times.

$E_1(\ )E_2$ does not change the rank, as $E_{1,2}$ are nonsingular. Thus $\Phi_P$
has the same rank as $\Theta_P$.

Theorem 12. The trace of $\Theta_P$ and $\Phi_P$ is $n_{\Theta,\Phi}$ times the trace of $P$.
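
Theorems 10 to 12 can be checked with exact arithmetic. The sketch below computes rank and determinant by Gaussian elimination over fractions; the helper `rank_and_det` and the sample matrices are assumptions made for the illustration, and the roots of Theorem 9 are not computed here.

```python
from fractions import Fraction


def kron(A, B):
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def trace(M):
    return sum(M[i][i] for i in range(len(M)))


def rank_and_det(M):
    """Rank and determinant of a square matrix by exact Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in M]
    n, r, det = len(rows), 0, Fraction(1)
    for c in range(n):
        pivot = next((i for i in range(r, n) if rows[i][c] != 0), None)
        if pivot is None:
            det = Fraction(0)          # no pivot in this column: singular
            continue
        if pivot != r:
            rows[r], rows[pivot] = rows[pivot], rows[r]
            det = -det                 # a row swap changes the sign
        det *= rows[r][c]
        for i in range(r + 1, n):
            f = rows[i][c] / rows[r][c]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r, det


P = [[2, 1], [1, 3]]     # rank 2, determinant 5, trace 5
S = [[1, 2], [2, 4]]     # singular, rank 1
n = 3

theta_P, phi_P = kron(identity(n), P), kron(P, identity(n))
print(rank_and_det(theta_P), rank_and_det(phi_P), trace(theta_P), trace(phi_P))
print(rank_and_det(kron(identity(n), S)))
```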

The fact that the proofs of theorems 5 to 12 are based on the same
type of syllogisms leads us to seek a general theorem of which the others
are special cases. First we note the matrix $\Theta_P$ is the direct sum,* say

$\Theta_P = P \dotplus P \dotplus \cdots \dotplus P$,  (20)

$P$ written $n_\Theta$ times. Let $S$ be any algebraic operation or condition
(examples are given below). The desired theorem is

Theorem 13. The symbol $S$ is distributive with respect to direct
addition.

$S(P_1 \dotplus P_2 \dotplus \cdots \dotplus P_n) = SP_1 \dotplus SP_2 \dotplus \cdots \dotplus SP_n$

The symbol $S$ may stand for the adjoint, inverse, transpose or hermitian
conjugate, or for “the determinant of” (the direct sum of determinants is
their product), or for “rank of,” “trace of” (the direct sum of scalars is the
ordinary sum). Again $S$ may symbolise a combinatory process between
two direct sums; the theorem states that the combination is the direct
sum of the same combinatory process between corresponding terms of
the factor sums. Such processes are ordinary, composite, indeterminate,
or double dot multiplication, or ordinary addition. Further $S$
may indicate the orthogonal, unitary or hermitian character.

Proof. The direct sum, the most general addition of two matrices, is
induced in a manifold which is reducible. The sub-manifolds are
independent, but when an operation is applied, it is assumed to be
carried out in all the sub-manifolds, which is the theorem.
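
A few instances of Theorem 13, as a sketch: the transpose, the ordinary product and the trace are each carried out block by block on a direct sum, and the direct sum of equal blocks reproduces (20). The helper `direct_sum` and the sample blocks are assumptions made for the illustration.

```python
def direct_sum(*blocks):
    """Place the blocks along the main diagonal, with zeros elsewhere."""
    n_cols = sum(len(b[0]) for b in blocks)
    out, offset = [], 0
    for b in blocks:
        for row in b:
            out.append([0] * offset + list(row) + [0] * (n_cols - offset - len(row)))
        offset += len(b[0])
    return out


def kron(A, B):
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(M):
    return [list(col) for col in zip(*M)]


def trace(M):
    return sum(M[i][i] for i in range(len(M)))


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


P1, P2 = [[1, 2], [3, 4]], [[5]]
Q1, Q2 = [[0, 1], [1, 1]], [[2]]

total = direct_sum(P1, P2)
print(transpose(total) == direct_sum(transpose(P1), transpose(P2)))
print(matmul(total, direct_sum(Q1, Q2)) == direct_sum(matmul(P1, Q1), matmul(P2, Q2)))
print(trace(total) == trace(P1) + trace(P2))
print(direct_sum(P1, P1, P1) == kron(identity(3), P1))   # equation (20)
```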

* MacDuffee, loc. cit., gives a definition. “Indeterminate” would be a name
more in keeping with our language.




The function $\Phi_r$ is also a direct sum written out by a different
convention (permutations of the row and column order of $\Theta_r$). The
permutation operator $E_1(\ )E_2$, being hermitian, rotates the axes of the
basis elements in the manifold of the direct sum. Similarly $F_r$ and $G_r$
(13) are direct sums.

The theorems concerning the composite product $A \times B$ that we are about
to present may be considered corollaries to a similar general theorem:

Theorem  14.  S  is  distributive  with  respect  to  composite  multiplication. 

Proof. We have shown $A \times B = F_A \cdot F_B$. The dot (ordinary)
multiplication is really summation of corresponding elements of $F_A$ and $F_B$. If $S$
is distributed among the sub-manifolds of $F$ (Theorem 13), it will be
distributed among the sum of corresponding sub-manifolds.


THEOREMS  ON  COMPOSITE  PRODUCTS 

Composite  multiplication  is  not  commutative,  is  associative  and  is 
distributive  with  respect  to  addition. 

Theorem 15. First proved by Stephanos,

$(A_1 \times B_1)(A_2 \times B_2) \cdots (A_s \times B_s) = (A_1 A_2 \cdots A_s) \times (B_1 B_2 \cdots B_s)$

provided the $A$'s and $B$'s are multiplicative among themselves, and
certain elements are commutative. For

$(A_1 \times B_1)(A_2 \times B_2) = \Phi_{A_1}\Theta_{B_1}\Phi_{A_2}\Theta_{B_2} = \Phi_{A_1}\Phi_{A_2}\Theta_{B_1}\Theta_{B_2}$ by theorem (1) and the above provision,

$= \Phi_{A_1A_2}\Theta_{B_1B_2}$ by theorem (5)

$= A_1A_2 \times B_1B_2$

The next step $(A_1A_2 \times B_1B_2)(A_3 \times B_3)$ is found by repeating the
above process.
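The two-factor case of Theorem 15 is easy to test on small integer matrices. The sketch below assumes that the composite product coincides with what is now usually called the Kronecker product, with block $(i, j)$ equal to $a_{ij}B$; the functions `kron` and `matmul` are illustrative helpers, not notation from the paper.

```python
def kron(a, b):
    """Composite (Kronecker) product: block (i, j) of the result is a[i][j] * b."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    """Ordinary matrix product of conformable matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# Rectangular factors, chosen so that A1*A2 and B1*B2 both exist.
A1 = [[1, 2, 0], [3, 1, 4]]      # 2 x 3
A2 = [[2, 1], [0, 1], [1, 3]]    # 3 x 2
B1 = [[1, -1], [2, 0]]           # 2 x 2
B2 = [[3], [5]]                  # 2 x 1

lhs = matmul(kron(A1, B1), kron(A2, B2))       # (A1 x B1)(A2 x B2)
rhs = kron(matmul(A1, A2), matmul(B1, B2))     # (A1 A2) x (B1 B2)
mixed_product_holds = lhs == rhs
```

Here `lhs` and `rhs` are both 4 by 2 matrices, and the flag records whether they agree entry by entry.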

This theorem (and the theorem on ranks, below) reveals the entire
lack of interaction of different generations on each other. Thus the
h'th factor of a composite product has its effect (in multiplication with
another composite matrix) solely in the h'th generation.

Theorem 16. The adjoint,⁸ inverse,⁸ transpose, or hermitian conjugate
of a composite matrix is the composite product of the adjoint,
etc., of the factors.

$(A_1 \times A_2 \times \cdots \times A_p)^* = (A_1^* \times A_2^* \times \cdots \times A_p^*)$

⁸ Proved for $A \times B$, $A$ and $B$ square, by Rice, Adjoint etc., loc. cit. Theorem
19 will show that there is an inverse only when all the factors are non-singular,
and that the adjoint of a composite product which happens to be square although
made up of rectangular factors, is zero.

Proof. 

$(A \times B)^* = (\Phi_A \Theta_B)^* = \Theta_B^* \Phi_A^*$ by theorem 6

The  orders  of  the  last  two  matrices  (see  convention,  page  437)  are 
n^ria*  and  m^^ma,  n4>ma. 

By  theorem  1,  this  is 

A*  X  B*. 

To prove the theorem for the continued product, let
$B = A_2 \times A_3 \times \cdots \times A_p$.

$B^*$ is found by repeating the procedure $p - 1$ times.

The composite product of the inverses of the corresponding factors of
$\Omega$ is both a left- and right-handed inverse. There are also “bilateral”
inverses, for example $(A^{-1} \times I_B)(\ )(I_A \times B^{-1})$ operating on $A \times B$
gives $I$. In general the operator,

$(A_1^{-1}$ or $I_1) \times (A_2^{-1}$ or $I_2) \times \cdots \times (A_p^{-1}$ or $I_p)(\ )(I_1$ or $A_1^{-1}) \times \cdots \times (I_p$ or $A_p^{-1})$

where $A_h^{-1}$ or $I_h$ appears once and only once (and then as the h'th
factor), is a “bilateral” inverse of $\Omega$. The pre- or post-factors of these
bilateral inverses form “partial” inverses, inverses only in certain
generations.
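The transpose case of Theorem 16 and the “bilateral” inverse described above can be illustrated on 2 by 2 integer matrices whose inverses are also integral. This is a sketch under the assumption that the composite product is the Kronecker product with block $(i, j)$ equal to $a_{ij}B$; all function and variable names are illustrative.

```python
def kron(a, b):
    """Composite (Kronecker) product: block (i, j) of the result is a[i][j] * b."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# Unimodular sample factors, so the inverses have integer entries.
A, A_inv = [[2, 1], [1, 1]], [[1, -1], [-1, 2]]
B, B_inv = [[1, 2], [0, 1]], [[1, -2], [0, 1]]
I2, I4 = identity(2), identity(4)
K = kron(A, B)

# Transpose of the composite equals the composite of the transposes.
transpose_ok = transpose(K) == kron(transpose(A), transpose(B))

# The composite of the inverses is a left- and right-handed inverse.
inv = kron(A_inv, B_inv)
two_sided_ok = matmul(inv, K) == I4 and matmul(K, inv) == I4

# Bilateral inverse: (A^-1 x I)( )(I x B^-1) applied to A x B gives the identity.
bilateral_ok = matmul(matmul(kron(A_inv, I2), K), kron(I2, B_inv)) == I4
```

Each factor of the bilateral operator, taken alone, inverts only one generation: `kron(A_inv, I2)` applied on the left turns `K` into `kron(I2, B)`.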

Theorem 17. The determinant of a continued composite product is
the product of determinants of the factors, each raised to the power
which is the product of the orders of all the other factors.⁹

$|A \times B| = |\Phi_A||\Theta_B| = |A|^{n_B} |B|^{n_A}$

In the general case by equation (14) and (15) and theorems 2 and 3,

$\Omega = A_1 \times A_2 \times \cdots \times A_p = \prod_K \Phi_K$ (21)

⁹ This theorem for two factors has long been known: K. Hensel, Acta math., 14,
317-9, 1891, E. Netto, id. 17, 199-204, 1893, R. D. von Sterneck, Mh. Math. u.
Phys., 6, 205-7, 1895. Oldenburger, loc. cit., states the general theorem by
induction.



where  K  is  the  subscript  Or-i  ending  in  64*  •  4>i  ~  Ai.  The  order  of  4>k 
is  IXy  mAj,  j  —  /C  4-  I  to  p,  and  the  order  of  0*  is  n4c_i.  By  theorem 
3  these  functions  are  simplified  to 

n  *6^^  *  -  1  to  p  . (22) 


where  the  order  of  is 

n  ^Af  j  -  1  to  »  —  1 

i 

Taking determinants

$|\Omega| = \prod_i |A_i|^{N_i}$ (23)

$N_i = \prod_{j \neq i} n_{A_j}$.
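The two-factor form of Theorem 17, $|A \times B| = |A|^{n_B}|B|^{n_A}$, can be confirmed on a small example. The sketch assumes square factors and the Kronecker-product reading of the composite product; `kron` and `det` are illustrative helpers.

```python
def kron(a, b):
    """Composite (Kronecker) product: block (i, j) of the result is a[i][j] * b."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, v in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * v * det(minor)
    return total

A = [[1, 2], [3, 4]]                      # order n_A = 2, determinant -2
B = [[2, 0, 1], [1, 3, 0], [0, 1, 1]]     # order n_B = 3, determinant 7
n_A, n_B = len(A), len(B)

det_composite = det(kron(A, B))
# Each determinant is raised to the order of the other factor.
det_predicted = det(A) ** n_B * det(B) ** n_A
determinant_rule_holds = det_composite == det_predicted
```

For these factors the predicted value is $(-2)^3 \cdot 7^2 = -392$.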


RANK 

Let us return again to the view that the composite is derived from a
4-way matrix for the purpose of comparing the various conventions of
forming two bipartite indices,¹⁰ which are nothing more than the ways
of representing the four dimensional matrix on paper by 2-way couches
of various aspects. The resulting rank depends on the choice.

A matrix $R = \| r_{ijkl} \|$ gives rise to three 2-way representations:
the $ij$-, $ik$-, $il$-couches are rows (the $kl$-, $jl$-, $jk$-couches, forming the
columns, are contraspective and have the same rank as their counterlying couches).¹¹

Theorem  18.  In  the  indeterminate  product  of  two  matrices  the  rank 
on  ij  is  unity.  The  ik  and  il  ranks  are  equal  and  equal  to  the  product 
of  the  ranks  of  the  factors.  Thus  the  rank  of  a  composite  (2-way) 
matrix  is  exactly  equal  to  the  product  of  the  ranks  of  the  factors. 

Proof. The $ij$-couches are the intersection of layers of aspect $i$ and
aspect $j$, viz. the partitions of $\Psi$, which are all proportional.

The number of independent $ik$-couches is found as follows: The
i'th row of $A$ is

$a_i = \alpha_{ij} a_j$, $i = 1$ to $m_A$ (24)

and $j$ refers only to $r_A$ independent rows of $A$, say $j = 1$ to $r_A$.

¹⁰ The use of one simple and one tripartite index is not particularly enlightening.

¹¹ F. L. Hitchcock, The expression of a tensor or a polyadic as a sum of products,
Jnl. of Math. and Phys., 6, 164-180, 1927. Also, Rice, Couche ranks etc., loc. cit.


Then $\alpha_{ij} = \delta_{ij}$ only when $i$ also refers to the chosen independent
rows of $A$. Similarly for $B$,

$b_k = \beta_{kl} b_l$, $k = 1$ to $m_B$ (25)

The (ik)'th row of $\Psi$ which is (the bars separate partitions)

$a_{i1} b_k \,|\, a_{i2} b_k \,|\, \cdots \,|\, a_{i n_A} b_k$ (26)

can be written as the sum

$\alpha_{ij}\beta_{kl}\, a_{j1} b_l \,|\, \alpha_{ij}\beta_{kl}\, a_{j2} b_l \,|\, \cdots \,|\, \alpha_{ij}\beta_{kl}\, a_{j n_A} b_l$ (27)

i.e.

$\psi_{(ik)} = \alpha_{ij}\beta_{kl}\,\psi_{(jl)}$ (28)

Thus there are no more than $r_A r_B$ independent rows in $\Psi$: there are no
less, for if we interpret the e's of the introductory paragraph as Gibbs'
unit matrices,

$\| (\delta_{si}\delta_{tj})_{ij} \|$, $i = 1$ to $m_{A,B}$, $j = 1$ to $n_{A,B}$

those e's where $s, t$ refer to elements in a minor of rank $r_A$ (or $r_B$) form $r_A r_B$
linearly independent symbols $(e_i e_j)$ in the indeterminate or composite
product of $A$ and $B$, by definition of these products.¹²
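Theorem 18 can be illustrated by computing ranks exactly with rational arithmetic. In this sketch the rows of the Kronecker product are taken as the $ik$-couches, and a second arrangement with rows labelled by $(i, j)$ and columns by $(k, l)$ stands in for the $ij$-couche representation; this index convention and the helper names are assumptions made for illustration.

```python
from fractions import Fraction

def kron(a, b):
    """Composite (Kronecker) product: row (i, k), column (j, l) holds a[i][j] * b[k][l]."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def ij_couche_matrix(a, b):
    """Same 4-way array, but rows labelled by (i, j) and columns by (k, l)."""
    return [[a[i][j] * b[k][l]
             for k in range(len(b)) for l in range(len(b[0]))]
            for i in range(len(a)) for j in range(len(a[0]))]

def rank(m):
    """Rank by exact Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in m]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

A = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]   # rank 2: the second row is twice the first
B = [[1, 2], [3, 4]]                    # rank 2

rank_A, rank_B = rank(A), rank(B)
rank_composite = rank(kron(A, B))        # rank on ik: product of the factor ranks
rank_ij = rank(ij_couche_matrix(A, B))   # rank on ij: unity
```

The $ij$ arrangement is the outer product of the flattened factors, which is why its rank is one whenever neither factor is zero.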

Theorem 19. In the 2p-way matrix

$\Omega = A_1 \times A_2 \times \cdots \times A_h \times \cdots \times A_p$ (4)

the consequence of continued indeterminate products, the p-way couches
whose locants contain one index from each factor have rank equal to the
product of the ranks of the factors; those couches of class $2(p - q)$
which have aspects containing $q$ pairs of indices, each pair from one
factor, have rank unity.

Proof. First consider the 2-way matrix $\Omega$, (4), whose rows are the
$i_1 \cdots i_p$-couches (therefore the columns are $j_1 \cdots j_p$-couches). Let the
factor matrices $A_h$ have $m_h$ rows, $n_h$ columns and be of rank $r_h$.

Then

$a^{(h)}_{i_h} = \alpha^{(h)}_{i_h k_h}\, a^{(h)}_{k_h}$ (29)


¹² Oldenburger, loc. cit., has shown the rank of a composite product is less than
or equal to the product of the ranks of the factors. In a private communication
to Prof. Hitchcock he very kindly points out that the inequality may be
removed by the lemma in his paper.



where $k_h$ refers only to $r_h$ independent rows, and is constant throughout
a row. Only when $i_h$ also refers to one of the independent rows
will $\alpha^{(h)}_{i_h k_h} = \delta_{i_h k_h}$.

Substituting (29) in the $(i_1 \cdots i_p)$'th row of $\Omega$, this row is
converted into a sum, in each term of which we may collect the $\alpha$'s into
one coefficient, since each $\alpha$ appears in every element of the row.

$\omega_{(i_1 \cdots i_p)} = \alpha^{(1)}_{i_1 k_1} \cdots \alpha^{(p)}_{i_p k_p}\, \omega_{(k_1 \cdots k_p)}$ (30)


There are then exactly $r_1 r_2 \cdots r_p$ times when this sum reduces to an
identity. These rows, formed from the independent rows of the factors,
cannot be linearly dependent from the very indeterminacy of the
product. Therefore the rank of $\Omega$ is precisely the product of the ranks
of the factors.

Now consider the 2-way matrix whose rows are the mixed index
$(i$ or $j)_1 \cdots (i$ or $j)_p$-couches. These rows are the $i_1 \cdots i_p$-couches of
the 2-way matrix

$(A_1$ or $A_1') \times (A_2$ or $A_2') \times \cdots \times (A_p$ or $A_p')$ (31)

For every $j_h$ in the mixed index we have the transpose of $A_h$ in (31).
But the number of independent rows in a transpose is the same as in the
original matrix. Thus the rank of (31) is the same as the rank of $\Omega$;
ergo, the rank of the mixed indices is the same as on $i_1 i_2 \cdots i_p$.

Finally, if $q$ pairs of aspect-indices are taken from the same $q$ matrix
factors, the rows of the 2q-way couche representation of the product will
be proportional, the elements of the $q$ factors being held constant in
each row, and the $2(p - q)$ indices, being the locant of the couche, take
on their values in the different columns.


COMPOSITE  DIVISION 

The condition for a matrix to be a composite of two 2-way matrices
is for it to be capable of being written as a 4-way matrix (thus neither
the number of rows nor the number of columns may be prime), and the
couche rank on some two indices be unity. (The rank on the contraspective
indices will also be unity.) These will be the two indices of a
factor, the other factor of the composite product being proportional to
the couches, the proportionality constants being the elements of the
first factor. The post-factor will form the dependent blocks (all of the
same size) in the original matrix.
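The criterion above suggests a simple procedure for composite division: pick a nonzero element, read the post-factor off the block that contains it, and read the pre-factor off the proportionality constants. The sketch below is one possible realisation, assuming the Kronecker-product layout with block $(i, j)$ equal to $a_{ij}B$ and assuming the row and column counts are divisible by the chosen block size; the function name `composite_divide` is hypothetical.

```python
from fractions import Fraction

def kron(a, b):
    """Composite (Kronecker) product: block (i, j) of the result is a[i][j] * b."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def composite_divide(m, mb, nb):
    """Try to write m = A x B with B of shape (mb, nb); return (A, B) or None.

    The factors are only determined up to a scalar, so B is taken to be the
    block containing the first nonzero element and A is scaled to match.
    """
    ma, na = len(m) // mb, len(m[0]) // nb
    pivot = next(((r, c) for r in range(len(m)) for c in range(len(m[0]))
                  if m[r][c] != 0), None)
    if pivot is None:
        return None                      # the zero matrix is not divided here
    i0, k0 = divmod(pivot[0], mb)        # block row, and row inside the block
    j0, l0 = divmod(pivot[1], nb)        # block column, and column inside the block
    p = Fraction(m[pivot[0]][pivot[1]])
    B = [[Fraction(m[i0 * mb + k][j0 * nb + l]) for l in range(nb)]
         for k in range(mb)]
    A = [[Fraction(m[i * mb + k0][j * nb + l0]) / p for j in range(na)]
         for i in range(ma)]
    # Accept only if all blocks really are proportional to B (couche rank unity).
    return (A, B) if kron(A, B) == m else None

M = kron([[2, 4], [6, 8]], [[1, 2, 3], [4, 5, 6]])
factors = composite_divide(M, 2, 3)

# A diagonal matrix whose 2 x 2 blocks are not all proportional to one another.
N = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 2]]
not_composite = composite_divide(N, 2, 2)
```

For `M` the recovered pair is `[[1, 2], [3, 4]]` and `[[2, 4, 6], [8, 10, 12]]`, which differs from the factors used to build `M` by the scalar 2, as the text leads one to expect.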



Theorem 20. If the number of rows of a matrix is a product of $p$
factors, and the number of columns is also a product of $p$ factors, the
matrix can be written as a 2p-way matrix, $P$. If there are $l$-way couches
of aspect $k_1 \cdots k_l$ so that this couche rank is equal to unity, the
matrix is a composite of an $l$-way and a $p$-$l$-way matrix.¹³ The couche
ranks of the factors may similarly be investigated. It follows that if the
indices of $P$ can be divided into $r$ groups of $l_h$ indices each, $h = 1$ to $r$,
so that the $l_h$-couche ranks for every value of $h$ are unity, there are
$r$ $p$-$l_h$-way factor matrices ($h = 1$ to $r$) whose indeterminate product
gives $P$.

E. B. Wilson says in the discussion of the indeterminate product of
vectors¹⁴

“The most general product conceivable ought to have the property
that when the product is known the two factors are also known.”
Theorem 20 essentially states that apart from a scalar we can solve an
indeterminate product for all its factors. The proviso must be added
since the indeterminate product of two matrices is the most general
product in which scalar multiplication is associative,

$k(A \times B) = (kA) \times B = A \times (kB) = (k'A) \times (k''B)$, with $k'k'' = k$.

Corollary 1. If $A \times B = I$ then $A = kI_A$ and $B = k^{-1}I_B$ where $k$
is a scalar.

Corollary 2. If $A \times B = R \times S$ then $B = T \times S$ and $R = A \times T$
where $T$ may be a scalar.¹⁵ Thus if $A$ and $B$ are not composite, $R$ is
proportional to $A$, and $S$ is proportional to $B$.

Corollary  3.  There  are  no  divisors  of  zero. 

This  theorem  is  responsible  for  the  existence  of  the  converse  of  the 
following  theorem. 

Theorem 21. If the factors are orthogonal, unitary or hermitian, so is
$\Omega = A_1 \times A_2 \times \cdots \times A_p$.

Proof. $B \times C = \Phi_B \Theta_C$ is a product of orthogonal, unitary or
hermitian matrices, theorems 4 and 6. To prove the theorem let
$B = A_1$ and $C = A_2$; repeat with $B = (A_1 \times A_2)$ and $C = A_3$, and so on.

Converse. If $\Omega$ is orthogonal, unitary or hermitian, so are all its
factors, aside from scalar factors.

Proof. When $\Omega$ is hermitian,

$A_1 \times A_2 \times \cdots \times A_p = A_1' \times A_2' \times \cdots \times A_p'$

by theorem 16. Then by theorem 20

$A_1 = A_1'$, $A_2 = A_2'$, $\cdots$, $A_p = A_p'$

aside from scalars. For the other cases

$(A_1 \times A_2 \times \cdots \times A_p)(A_1^* \times A_2^* \times \cdots \times A_p^*) = I_\Omega = I_1 \times I_2 \times \cdots \times I_p$ by theorem 20

and is also $= A_1A_1^* \times A_2A_2^* \times \cdots \times A_pA_p^*$ by theorem 15.

Thus by theorem 20 again, $A_1A_1^* = I_1$, $A_2A_2^* = I_2$, $\cdots$, $A_pA_p^* = I_p$.

¹³ This is parallel to a theorem in polyadics of Hitchcock, id. p. 177.

¹⁴ E. B. Wilson, Vector analysis, New Haven, p. 272.

¹⁵ Some work was done on this equation by Rutherford, loc. cit.

Theorem 22. If the factors are diagonal, so is $\Omega$.

Converse. If $\Omega$ is diagonal so are all the factors; for if the h'th is not,
the main diagonal partitions of the h'th generation will not be diagonal
matrices.

Theorem 23. If the factors are triangular, so is $\Omega$. The converse is
not true unless all the partitions of the p'th generation that lie below the
main diagonal be zero, and all those that lie on it or above be triangular.

Theorem 24. If the transformation $H^{-1}AH$ reduces $A$ to a triangular
form $T$, and $K^{-1}BK$ reduces $B$ to a triangular form $V$, then the operation
$(H^{-1} \times K^{-1})(\ )(H \times K) = S^{-1}(\ )S$ will reduce $\Psi$ to the triangular
form $T \times V$, by theorem 15, and the last. A similar theorem holds for
operations reducing matrices to diagonal forms.†

Theorem 25. If the roots of $A$ and $B$ are $\lambda_A$ and $\lambda_B$ respectively, the
roots of $\Psi$ are the $n_A n_B$ roots $\lambda_\Psi = \lambda_A \lambda_B$. The proof follows from the
last theorem, and the fact that the roots are the main diagonal elements
in the triangular forms.
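The argument behind Theorem 25 can be followed on factors that are already triangular, where the roots are simply the diagonal elements. The sketch assumes upper triangular factors and the Kronecker-product reading of the composite product; the sample matrices `T` and `V` are arbitrary illustrations.

```python
def kron(a, b):
    """Composite (Kronecker) product: block (i, j) of the result is a[i][j] * b."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

T = [[2, 1], [0, 3]]                     # triangular, roots 2 and 3
V = [[1, 4, 0], [0, 5, 2], [0, 0, 7]]    # triangular, roots 1, 5 and 7
K = kron(T, V)

# The composite of triangular factors is again triangular (Theorem 23).
is_upper_triangular = all(K[r][c] == 0 for r in range(len(K)) for c in range(r))

# Its roots are therefore its diagonal elements ...
roots_K = sorted(K[d][d] for d in range(len(K)))

# ... and these are the n_A * n_B pairwise products of the roots of the factors.
root_products = sorted(t * v
                       for t in (T[i][i] for i in range(len(T)))
                       for v in (V[k][k] for k in range(len(V))))
roots_match = roots_K == root_products
```

For general factors the same conclusion follows after the reduction of Theorem 24; the sketch only covers the case in which that reduction has already been carried out.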

The statements that follow are also true if the letters $A$ and $B$ are
interchanged wherever they occur.

If $\lambda_A = 0$ then there are $n_B$ roots $\lambda_\Psi = 0$.

If $\lambda_A = 1$ then there are $n_B$ roots $\lambda_\Psi = \lambda_B$.

Thus a composite matrix can only have $N n_A$ zero roots, where $N$ is an
integer, $= 0, 1, \cdots, n_B$.

There are definite conditions for $\Psi$ to have distinct roots. No roots
of the factors may be zero or unity; the roots of $A$ and the roots of $B$
must be distinct. A necessary condition is then that of the $n_A$, $n_B$ roots
$\lambda_A$, $\lambda_B$ no more than two, one in each set, can be identical. This is not a
sufficient condition, for if the ratio of two roots of $A$ be equal to the ratio
of two roots of $B$, there will be two identical roots of $\Psi$.

Theorem 26. The matrices $A \times B$, $A' \times B$, $A \times B'$, $A' \times B'$,
$B \times A$, $B \times A'$, $B' \times A$ and $B' \times A'$ are all equivalent, and may be
converted into each other by permuting the rows and columns (and in
particular by hermitian operations).

† It is always possible to find an $S = H \times K$ to reduce $\Psi$ to a triangular form.

Proofs. $E_1(A \times B)E_2 = B \times A$ in $n_A!\,m_B!$ different ways

niAmB,  m*m.i  •  tnAma,  UaTIb  •  tiatib,  nBtiA  «  n«n4 

For the left-hand side may be expanded as

El  «l»^  E,  E,  Qb  E,  -  e?  >  <I>7  ’  . (32) 

mBfriA,  mATtiB •  mAfnB,  nAtriB •  nAmBmBfiA  •  mBtiA,  riAmB  •  tiAniB,  fiAtiB  •  n^n^,  ubtIa 

«  mBfriA,  mBitA-niBnA,  nBTiA 

Or,  more  concisely,  »  rieW  »  and  ne  «  n«(?)  »  by  theorem 
(1)  and  analogy  with  equation  (8a). 

The other transformations may be brought about by introducing
another type of transposed identity matrices (in the composite manifold).
Let $I''$ be $\| \delta_{i\gamma j} \|$ where the permutation $\gamma$ is the reverse of
the natural order. $I''$ is involutory and hermitian. The operation
$I''PI''$ gives the transpose of $P$; $I''\cdot$ alone reverses the order of the rows,
and $\cdot I''$ the order of the columns. Then let $\Theta_{I''} = J$ and $\Phi_{I''} = J'' = E_1JE_2$.
The matrices $J$ are also involutory and hermitian transposed
identity matrices. We have the following type of equation:

$J(A \times B)J = A \times (I''BI'') = A \times B'$ (33)

$J''(A \times B)J'' = A' \times B$ (34)

r\A  X  B)Ji  -  EiJtEM  X  B)Jx  =  ExJtiB  X  A)EbJi . (36) 


The matrix $EJ$ is a permutation matrix similar to type $E$,

»  6ki6ay 

Corollary 1.

$E_1(B \times A)E_2 = A \times B$
and $E_1(A \times B) = (B \times A)E_2$
The bipartisation of indices and subsequent direct or dot product of
the resulting matrices can be interpreted as the introduction of a double
dot product in the $n^4$-manifold of 4-way matrices (not necessarily
composite). In all the previous equations the matrix multiplication should
be indicated by double dots.

R;  S  =  r<x»»8xy»/ 



where the dots (summation) refer to the second and fourth indices (or
vectors in the case of double polyadics) of the prefactor, and the third
and first of the postfactor. In the case the factors are composite this
is the basis of the theorem 15,

$A_1 \times B_1 : A_2 \times B_2 = A_1 \cdot A_2 \times B_1 \cdot B_2$

We could define the double dot products to refer to any index. The
double dot $:$ in double dyadic multiplication¹⁶ refers to the last two
vectors (indices) of the prefactors and the first two of the postfactor.

$A_1 \times B_1 :^{34,12} A_2 \times B_2 = (B_1 : A_2)\, A_1 \times B_2$

Or we may have the dots referring to, say, the 23 and 14 indices

$A_1 \times B_1 :^{23,14} A_2 \times B_2 = A_1 \cdot A_2 \times B_2 \cdot B_1$ (36)

and so on, there being twenty-four different double multiplications of
4-way matrices.

There are in all twelve kinds of double multiplications of a 4-way
matrix with a 2-way matrix, some of which are useful in the discussion
of the linear matrix equation,

$\theta(x) = \sum_i A_i x B_i = C$, $n_{A_i} = m_{B_i} = n$ (37)

which can be written

\\Zi(AfX  B[)\\;**x  . (38) 

or  ||2<(^,  X  B*)  II  ;«x  . (39) 


where $:^{ab}$ indicates summation of the first and second indices of $x$ with
the a'th and b'th of the 4-way matrix.¹⁷
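In present-day terms the rewriting of the linear matrix equation amounts to flattening $x$ into a single column and letting a sum of composite matrices act on it. The sketch below checks the identity $\mathrm{vec}(\sum_i A_i x B_i) = (\sum_i A_i \times B_i')\,\mathrm{vec}(x)$ for a row-by-row flattening; the flattening order, the use of the transposed $B_i$, and the helper names are assumptions chosen to make the identity hold, not a transcription of (38) or (39).

```python
def kron(a, b):
    """Composite (Kronecker) product: block (i, j) of the result is a[i][j] * b."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def vec(x):
    """Flatten a matrix row by row into a single list."""
    return [v for row in x for v in row]

A1, B1 = [[1, 2], [0, 1]], [[1, 0, 2], [0, 1, 0], [3, 0, 1]]
A2, B2 = [[0, 1], [1, 1]], [[2, 1, 0], [0, 0, 1], [1, 0, 0]]
X = [[1, 2, 3], [4, 5, 6]]            # the unknown, here given a sample value

# Left side: the sum A1 X B1 + A2 X B2, then flattened.
C = add(matmul(matmul(A1, X), B1), matmul(matmul(A2, X), B2))
lhs = vec(C)

# Right side: one composite coefficient matrix acting on the flattened unknown.
M = add(kron(A1, transpose(B1)), kron(A2, transpose(B2)))
x_flat = vec(X)
rhs = [sum(M[r][c] * x_flat[c] for c in range(len(x_flat))) for r in range(len(M))]

equation_matches = lhs == rhs
```

Once the equation is in this form, solving for $x$ reduces to solving an ordinary system of linear equations with coefficient matrix `M`.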

The first solution of (37) in a finite number of steps was obtained by
Hitchcock¹⁶ by means of the transformation

d(x)  «  aa'»x«bb'  »  ab'aT):**x  *=  . (40) 

As $\phi$ has the same Hamilton-Cayley equation

$\sum_p (-1)^p m_p \lambda^{n-p} = 0$ (41)

($p = 0$ to $n = m_A n_B$)

as $\theta$, Hitchcock's method was to evaluate the coefficients of this
equation. When the extent is unity they are (the subscript $s$ indicates the
trace)¹⁸

¹⁶ F. L. Hitchcock, A solution of the linear matrix equation by double multiplication,
Proc. Nat. Acad. Sci., 8, 78-83, 1921 and On double polyadics, with application
to the linear matrix equation, Proc. Am. Acad. Art and Sci., 88, 358-95, 1923.

¹⁷ In spite of Roth, MacDuffee, loc. cit., p. 89 is correct in writing the prefactor
of (39) as the matrix of coefficients of the unknown elements of $x$ when, as he says,
he writes the latter in the proper order.

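$m_0 = 1$

$m_1 = A_s B_s$

$2m_2 = A_s^2 B_s^2 - (A^2)_s (B^2)_s$

$6m_3 = A_s^3 B_s^3 - 3A_s B_s (A^2)_s (B^2)_s + 2(A^3)_s (B^3)_s$

$24m_4 = A_s^4 B_s^4 - 6A_s^2 B_s^2 (A^2)_s (B^2)_s + 3(A^2)_s^2 (B^2)_s^2 + 8A_s B_s (A^3)_s (B^3)_s - 6(A^4)_s (B^4)_s$ (42)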


Now  as  a  supplement  to  theorem  26,  given 

A  X  . (43) 

we  find  A  X  fl is  identical  with  ip  . (44) 

where 


/»  ->  II  iiikitdiikl  II 

the identical double dyadic, i.e. the $\phi$ for $A = I_A$ and $B = I_B$, is involutory
and hermitian in multiplication, in which the identity operator is


If  by  (36).  (44)  is  easily  demonstrated  by  expanding: 

II  is^ohi^6\ki^i)iiu  II  =  II  (a,,b</)o'u  ||  =  v’  . (44a) 

Given  the  Hamilton-Cayley  equation  for  <p  we  may  carry  out  the 
transformation 

y;«"(^  -  X/,)  U  J  A  X  B  -  \If  . (45) 


Thus $A \times B$ has the same Hamilton-Cayley equation as $\phi$, since the
index-permuting matrices $E$ and $J$ have determinants equal to unity.
By similar detailed transformations we can state that the matrices
obtained by reflections of the indeterminate product $A \times B$ in four-space
have the same Hamilton-Cayley equation provided certain double
multiplications or couche representations are used.

Theorem 27. The matrices $A \times B$, $A' \times B$, $A \times B'$, $A' \times B'$,
$B \times A$, $B \times A'$, $B' \times A$, $B' \times A'$ and the double dyadic $\phi$ have the
same Hamilton-Cayley equation, with coefficients given by (42). The
coefficients of the Hamilton-Cayley equation of a sum of composite
matrices are the same as those of the corresponding double dyadic given by
Hitchcock.

¹⁸ Recursion and explicit formulae will be given in a later paper.

Axes (“eigenvectors”) are defined by equations of the type

$A \cdot r_A = \lambda_A r_A$ (46)

$B \cdot r_B = \lambda_B r_B$ (47)

If we consider the $r$'s as one column matrices, we may write

$A \cdot r_A \times B \cdot r_B = (A \times B) \cdot (r_A \times r_B) = \lambda_A \lambda_B (r_A \times r_B)$

i.e. $\Psi \cdot r_\Psi = \lambda_\Psi r_\Psi$, by theorems 15 and 25. We see $r_\Psi$, an axis of $\Psi$, is
still a one column matrix, or vector in $n_A n_B$ dimensions. In this respect
our convention in writing out the indeterminate product of $r_A$ and $r_B$
differs from that of Gibbs and so justifies the use of a new symbol.
Nevertheless we may write the equation as

$A \times B : r_A | r_B = A \times B : R = \lambda_\Psi R$

$R$ being a dyad, or matrix of rank one. When multiplying into 2-way
matrices, $:$ is the same as defined above, the 2nd and 4th indices of
the post-factor being dummies. Hence

$A \cdot R \cdot B' = \lambda_\Psi R$

the truth of which is evident from a dyadic decomposition

$A \cdot r_A | r_B \cdot B' = \lambda_A r_A | \lambda_B r_B$

and since $r_B \cdot B' = B \cdot r_B$, this is the indeterminate product of equations
(46) and (47).
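Both ways of writing the axis of a composite matrix, as a vector in $n_A n_B$ dimensions and as a dyad $R$ of rank one, can be compared on a small example. The sketch assumes the Kronecker-product reading of the composite product and takes $B'$ to be the transpose of $B$; the sample matrices and axes were chosen by hand and the helper names are illustrative.

```python
def kron(a, b):
    """Composite (Kronecker) product: block (i, j) of the result is a[i][j] * b."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def scale(c, m):
    return [[c * v for v in row] for row in m]

A = [[2, 0], [1, 3]]
B = [[4, 1], [2, 3]]
r_A, lam_A = [[1], [-1]], 2      # A r_A = 2 r_A
r_B, lam_B = [[1], [1]], 5       # B r_B = 5 r_B

# The axis as a one column matrix in n_A * n_B dimensions.
r_Psi = kron(r_A, r_B)
vector_form_ok = matmul(kron(A, B), r_Psi) == scale(lam_A * lam_B, r_Psi)

# The same axis as a dyad R = r_A r_B', a 2-way matrix of rank one.
R = matmul(r_A, transpose(r_B))
dyad_form_ok = matmul(matmul(A, R), transpose(B)) == scale(lam_A * lam_B, R)
```

The column `r_Psi` is the dyad `R` read off row by row, so the two checks express one statement in two layouts.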

If there are roots of a matrix which are identical, any linear combination
of the corresponding axes is an axis of the matrix. Thus the axes
of a composite matrix may either be conceived as vectors in $n_A n_B$-space or
as an $n_A$ by $n_B$ matrix of rank equal to the number of identical roots.

The axes of a 4-way matrix may be considered to be 2-way matrices.
The roots of the 2p-way matrix $\Omega$ are the $n_1 n_2 \cdots n_p$ roots

$\lambda_\Omega = \prod_h \lambda_{A_h}$

and the axes are p-way matrices which can be represented by a sum of $M$
p-ads where $M$ is the number of identical roots of $\Omega$.
